Can someone clarify whether this is intended as a joke or whether the author is actually confused? I mean, nothing about this makes sense: it's not "scripting"; it claims to introduce "strong typing" while it does nothing about typing; it introduces all kinds of operator aliases "modeled after Lua and Lisp" that are present in neither of these languages. But it's not an obvious parody either, so I'm genuinely not sure.
I mean he has to be serious, right: "Deprecate Lua, Python, JavaScript, Ruby and a dozen other languages, because Pretty C is the ultimate scripting language, but lightning-fast and strongly typed!!"
With C23 (nullptr, auto typing, typeof) and C11 (generics) it got more guarantees and type-related primitives. You can still do void*, but you are strongly discouraged from it.
#include "pretty.h"

void print_int(int value) {
    println(value);
}

int main(int argc, string argv[])
{
    long value = 23849234723748234;
    print_int(value);
}
How is this strongly typed?
$ cc test.c -o test && ./test
-1411401334
And to be clear, weak vs strong isn't a boolean property but a spectrum, but it would be hard to argue with a straight face that C is a strongly typed language.
C is likely the only example of a programming language that is clearly statically typed while at the same time being weakly typed. For a reason: as your example shows, it's a really bad idea (but understandable for a language from the 60's).
C is strongly typed in some areas in that ISO C requires a diagnostic if you mistakenly use a struct foo * where a struct bar * is required.
It's weak in many areas, such as, oh, that you can implicitly convert an out-of-range floating point value to an integer type and get undefined behavior.
Linkage in C is not type safe. An extern int x declaration in one translation unit can be matched with an extern double x = 0.0 definition in another. Linkers for C typically accept that without a diagnostic: the program links.
Sure, there is no "rule" against it. But words/phrases have commonly-accepted meanings and willfully ignoring or appropriating those meanings implies either cultural ignorance or a concealed agenda.
If you want to insist that scripting languages can be either compiled or interpreted, then it's better to just drop the word altogether and just say "language", because the "scripting" part has utterly lost its identity at that point.
generally they aren't, as scripting usually implies an interpreter, though no one is stopping you from using a wrapping script that quietly compiles on first run and caches a bunch of executables somewhere. not much different than python producing bytecode files as it goes along.
Script usually implies some kind of task that runs once and then exits, as opposed to a system that is expected to run indefinitely.
There are good reasons for why scripts are often interpreted and why systems are often compiled, but that's not what defines them. There are definitely scripts that are compiled and systems that are interpreted out in the wild.
the original definition is likely tossing shell commands in a file to run later. chaining commands together. since perl and python supplanted this, they get lumped in as 'scripting languages'. both certainly can be used to write long running systems or short one off tasks.
compiled languages are rarely used for one offs because the effort they require is usually greater than the task calls for.
a big part of perl/python use is in tying together libraries written in more difficult lower level compiled languages.
you'll also see scripting used to refer to languages embedded in larger projects. lua scripts to control entities in a game, for instance. do they compile these somehow? I never did in the little project I used lua for.
----
all of that together, I expect that scripting as a concept largely boils down to conceptually simpler languages with less view of the ugly underbelly of how things actually work in a computer, used to chain together abstractions created by lower level code.
scripting is duct-tape. whether you duct-tape together a one-off task or some wad of long running functionality is beside the point.
> you'll also see scripting used to refer to languages embedded in larger projects.
Yes, but this is conceptually exactly the same as the aforementioned shell scenario. This is not something different.
Just as I suspected, there is only one definition, and one that has proven to actually be well defined to boot as you managed to reiterate the only definition I have ever known to perfection.
Well, there's a few things I should probably get around to adding to CNoEvil[0] and ogw[1]... There always seem to be more every few months when this project reappears.
What do you consider the type of shell text, i.e. what's in argv and what you get from subprocess output? It's not well-formed utf8 strings because any random garbage can be in there, yet tools like awk and grep are ubiquitous.
I'd argue that strings and bytes are the same general type, but it's sometimes useful to give well-formed utf8 bytes a different type internally. Rust gets this mostly correct with OsString and String.
The way I understand it: bytes are just bytes, until you provide an encoding. Then they can be converted to a string, if validly encoded. Taking an array of characters and just treating it or casting it as a string is usually a bad idea.
The thing I think Rust maybe goofed, or at least made a little complicated, is their weird distinction between a String and a str (and a &str). As a newbie learning the language, I have no idea which one to use, and usually just pick one, try to compile, then if it fails, pick the other one. I'm sure there was a great reason to have two types for the same thing, that I will understand when I know the language better.
If you want to understand more deeply, the Rust Programming Language, chapter 4, uses String and &String and &str to talk about ownership and borrowing. Here’s a link to the start of that chapter: https://doc.rust-lang.org/stable/book/ch04-00-understanding-...
Your blog post is practical and clearly explains what to do, when, which is helpful. What's confusing is why Rust has the two types and why the language designers decided it was a good idea to have to convert back and forth between them depending on whether it was going in a struct or being passed as an argument. I suppose the "why" is probably better found in the Rust docs.
As a long-time C++ user, it seems like std::string vs const char* all over again, and we somehow didn't find a better way.
Yep, that’s exactly it: I wanted to focus purely on what to do, rather than weigh it down with what’s already in the Rust book.
It’s closer to std::string and std::string_view. But yes, in a language with value and reference semantics, when you also care about performance, you just can’t do any better: you need both types. Or at least, if you want the additional correctness guarantees and safety provided by communicating ownership semantics in the type. C gets away with just char * but then you have to read the docs to figure out what you’re allowed to do with it and what your responsibilities are.
A Rust `String` reference (i.e. &String) can always be passed where `&str` is expected because `String` has a `Deref<Target=str>` impl... in that sense they don't just appear similar, they are polymorphic.
There may not be a single encoding for every byte in a string. The encoding may not be knowable ahead of time. You might be trying to extract strings from a random blob of bytes with unknown origin. There's a thousand and one different variations.
To give a real example, I once wrote some python scripts to parse serial messages coming off a bus. They'd read the messages, extract some values with regex, and move on.
Unfortunately the bus had some electrical bugs and would intermittently flip random bits with no CRC to correct them. From my point of view, no big deal. If it's in something outside the fields I care about, I won't notice it. If it's flipped something I do care about we have a bad sample to drop or noise the signal processing will deal with. Either way, it's fine. Python on the other hand cared very much. I rewrote everything in C once I got sufficiently annoyed of dealing with it and more importantly explaining to others how they couldn't "simplify" things using the stdlib APIs.
// NB: Must be utf-8!
struct string {
    size_t sz;
    size_t capacity;
    unsigned char *buffer;
};
&String in Rust is roughly like `const struct string *`.
str in Rust is just an array of (guaranteed utf-8) unsigned bytes. It does not have a capacity, so it can't be resized. You can't directly construct one (on the stack), because its size is undetermined and Rust doesn't have dynamic-sized stack allocation.
&str, and Box<str>, are pointers to str, along with a size, and are roughly like this C:
// NOTE: Must be utf-8!
struct str_ptr {
    size_t sz;
    unsigned char *buffer;
};
The difference between &str and Box<str> is that the latter is an owned pointer to a heap allocation which will be freed when it goes out of scope. &str is unowned and might point anywhere: to a Box<str> on the heap, to a String on the heap, or to read-only static memory.
IMO, it's probably easier to first try to understand the difference between `Vec<u8>`, `&[u8]`, and `&Vec<u8>`, because they are slightly less "weird" than the string types: they aren't syntactically special like `str` is[1], and they don't have an implicit requirement to be utf8 that is inexpressible in the type system.
[1]: `str` is syntactically special because it is basically a slice, but isn't written in slice notation.
String/str are both valid UTF-8 by definition, though. Plain ol' piles of bytes in Rust are generally represented by Vec<u8>/[u8].
Rust could have done better in naming, but a definite design goal of the language (for better and worse) is to not make things that are complicated for the compiler appear simple to the user. Which unfortunately results in:
What you see on the screen of a terminal is Unicode strings. It is human-readable text. len() of a three-character string is 3 even if the underlying encoding holds it as 6 bytes.
Of course if you provide a separate set of functions for treating a string as human readable vs not you can also work with that. Basically len() vs byte_len().
But you can’t concat two human readable strings without ensuring they are of the same encoding. You can’t search a string by bytes if your needle is of a different encoding. You can’t sort without taking encoding and locale preferences into account, etc.
Pretending like you don’t care about encoding doesn’t work as we have seen time and again.
Given the nature of it (pretty.c) and the stated intention of being "backwards-compatible with C and all of its libraries", what would make more sense than sticking with C's multibyte strings?
At the language level C historically hasn't offered much support for working with specific character sets and their encodings. With C11 we got u"...", U"...", and u8"..." literals, and C23 adds type char8_t and similar, but there's still little/no built-in tooling for text processing.
For text processing, work with char* whose bytes are some encoding of Unicode, e.g. UTF-8; then you can use a C library such as libunistring or ICU.
However the bytes of a char* could instead be an encoding of a non-Unicode character set, e.g. GB2312 encoded as EUC-CN.
So char* is character set and encoding agnostic. And C-the-language doesn't even try to offer you tools for working with different sets and encodings. Instead, you can use a library or write your own code for that purpose.
A number of languages make the same decision, keeping the string type set/encoding agnostic, with libraries taking up the slack.
In Nim, for example, the string type is essentially raw bytes (string literals in .nim sources are UTF-8). If you're doing Unicode text processing then you'd use facilities from the std/unicode module.
Returning to the matter of pretty.c, since it's just sugar for C, it makes sense (to me) that the string type is just an alias for the set/encoding agnostic char*. It's up to the programmer to know and decide what the bytes represent and choose a library accordingly.
I don't agree. This doctrine presumes all of the following:
- String data will be properly encoded
- There is one encoding of strings (UTF-8 usually)
- Validation must occur when string data is created
- Truncating a logical codepoint is never acceptable
- You may not do string things to "invalid" bytes
- Proper encoding is the beginning and the end of validation
None of these things are consistently true. It's a useful practice to wrap validated byte sequences in a type which can only be created by validation, and once you're doing that, `Utf8String` and `EmailAddress` are basically the same thing, there's no reason to privilege the encoding in the type system.
Reminds me of a C++ codebase I once had to inspect that was entirely written as if it were written in Java. With camelcase naming for everything, getters and setters for every class variable, interfaces everywhere.
Also because the special characters were (and are) difficult to type on European keyboards.
Characters like []{}\|~ are behind multi-finger access and often not printed at all on the physical keys (at least in the past). You can see how this adds a hurdle to writing C…
Pascal was designed by a European, so he preferred keywords which could be typed on every international keyboard. C basically just used every symbol from 7-bit ASCII that happened to be on the keyboards in Bell Labs.
Just as an example, on my Slovenian QWERTZ layout: [ - altgr+f, ] - altgr+g, { - altgr+b, } - altgr+n, \ - altgr+q, | - altgr+w, ~ - altgr+1.
You get used to them, though you start feeling like a pianist after a short coding session. The one most annoying for me are the fancy javascript/typescript quotes, which I have to use all too often: ` - altgr+7.
I tried switching to US a few times, but every time muscle memory made me give up soonish - especially since there are big benefits to using same keyboard layout as other people in your office are using.
Also practically everytime I need to write a comment, commit message or email I need my č, š and ž. It's kinda nice to have them only a single keypress away.
I'm from a non-English country. I only ever use layout of my locale when I write in my language. That's how it was ever since I was a kid who knew little English. And that's how all computers I've encountered in my country are set up - English first, local second.
In addition, our layout overwrites only the numerics – all other symbols are the same as on a US layout.
There’s so many assumptions here about a person who’s starting to learn programming.
For starters, that they’re on Linux, they feel comfortable running complex CLI commands, they can memorize the U.S. layout just like that, and that they can type without looking at the physical keys (because changing the virtual mapping means keys produce something else than what the label says).
In reality, the learner’s first exposure to C family languages is more likely to be a website where you can run some JavaScript in a text box. And the first hurdle is to figure out how to even type {}. American developers just completely forget about that.
I’ve been writing C and its progeny (C++, JavaScript, Rust etc.) since 1990 on a Finnish keyboard.
The AltGr brackets are fine. The truly annoying character to type is the backtick (which is a quite new addition to the pantheon of special characters, C doesn’t use it).
My personal opinion is that Niklaus Wirth had the better overall ideas about clarity and inclusiveness in programming language design, but that battle is long lost. (What you consider the character set needed for "proper programming" is really a relatively new development, mid-1990s and later.)
Backticks were fairly important for shell scripting in the past, but have officially been replaced with $(), which can be nested.
My intuition is that Perl would be the most challenging on a keyboard where it's harder to type unusual punctuation, since it feels like a very punctuation-heavy language, but I don't know whether it actually uses more than C (I think the backtick has a shell-style meaning in Perl too).
Well, unless opting for something like Dvorak, you are indeed doomed to something that was specifically designed to please typewriter mechanical constraints without much care for the resulting ergonomics.
I use a Bépo layout personally, on a Typematrix 2030 most of the time, as French is my native language.
or maybe popular proglangs were designed for writing on USAn press/office keyboards – remember that UNIX came to be as a typesetting appliance — disregarding anyone else.
Spectacular?? Terrifying. If I need to type non-ASCII Latin characters I'll just use compose sequences. The thought of a non-U.S. keyboard layout with modifiers required to type []{}<> and so on is terrifying.
> camelcase naming for everything, getters and setters for every class variable, interfaces everywhere
This is not far off from the guidelines in many cases, e.g. Windows code (well, not every variable of course.) A lot of Java design was copied from C++.
I've seen similar codebases as well, written by people who have spent way too much time with Java. One even had its own String class, which was just a wrapper for std::string with Java-like methods.
- Type names are nice
- Perfect choice for the in-built func macros (like min)
- Len -- love it
- Named boolean operators -- might be a bit much but go for it
- Ternaries are illegible so you can only improve them
- Not completely sold on all your loop definitions but some make sense to me
- Resource tracking is impressive
- The for... look a bit ugly -- could probably call it something else
All in all: quite a solid attempt. I'll give you 8/10 for the design of this. The way you sketched this out in C using macros is really elegant. This actually looks like good code. Would I use it? It's a new language and I like C already. But it could help people learn C and think about language design, since the way you've done this is very clear.
Well, you don't have to use it all. My projects mostly use booleans, len(), max/min, and some operator aliases, because there wasn't much need for other tasty stuff yet. So give it a shot, even if for a couple of operator macros!
> The word "REPEAT" should not be used in place of "SAY AGAIN", especially in the vicinity of naval or other firing ranges, as "REPEAT" is an artillery proword defined in ACP 125 U.S. Supp-2(A) with the wholly different meaning of "request for the same volume of fire to be fired again with or without corrections or changes" (e.g., at the same coordinates as the previous round).
Yes, code blocks in Org are executable, but I was aiming for simple embedding and zero build-time, thus the conservative choice of separating the README and the actual header.
I was hoping to see a “this is just for fun” disclaimer but didn’t see one. Please never actually use this in a project that other people will have to read or contribute to.
This project looks really cool! Unfortunately, there’s just way too much magic involved. In my humble opinion, C is simply not the language for this level of magic—extreme use of macros (and hidden behavior in general) is how you end up with hard-to-detect (and hard-to-debug) bugs and security vulnerabilities. As cool as this project looks, I’d never feel comfortable using it in anything serious. A+ for the effort though!
You can use the Boehm-Demers-Weiser GC with C. It's conservative, because it has to be with C, so it may/will miss things (it will treat integers etc. as potential pointers, and so avoid freeing anything "pointed to" by them), and so it works best as an extra layer of protection/leak detector, but it can be used as a replacement for freeing memory too.
I feel compelled to try it out in a serious way and contribute to it. I have strong knowledge of Python and am learning C. Are there good reasons (apart from attracting the ire of C programmers) not to use it?
Does "strong typing" now just mean "static typing"? Afaik both lua and python are already strongly typed. Javascript is not and I have no clue about ruby.
> Does "strong typing" now just mean "static typing"?
The distinction strong and weak typing is irrelevant in practice.
Weak (but present) static typing beats strong dynamic typing every single time, because what is valuable is NOT "Do I see a type mismatch error only when a user accesses it?", it's "does this mismatch prevent a deployment?"
IOW, the only distinction in production is dynamic typing vs static typing, not strong typing vs weak typing.
> because what is valuable is NOT "Do I see a type mismatch error only when a user accesses it?", it's "does this mismatch prevent a deployment?"
I argue that understanding the semantics clearly and unambiguously is the most relevant thing, and strong typing tends to do that better (imho—with my only other examples being javascript and the "stringly-typed" perl and some random irrelevant BASIC dialects).
> Weak (but present) static typing beats strong dynamic typing every single time,
Can you give me an example? I don't think I've ever heard of such a thing. The closest I can think to this is maybe, arguably, the use of `void*` pointers in C which is difficult to see as anything other than pragmatism rather than some deeply beneficial way to write code—even explicit casts produce much more readable code. Another argument I could see is for operator overloading, which (IMO) produces much less readable code, or the implicit conversions feature of Scala (which also, IMO, produces less readable code, but they've addressed a lot of the major problems with it).
This code is incorrect, but I don't blame them. :) Probably one of the most common float-related mistakes, even among people who "know how floats work".
FLT_EPSILON is the difference between 1.0 and the next larger float. It's impossible for numbers less than -2.0 or greater than 2.0 to have a difference of FLT_EPSILON, they're spaced too far apart.
You really want the acceptable error margin to be relative to the size of the two numbers you're comparing.
Also, everyone should read the paper "What, if anything, is epsilon?" by Tom7
I would go even further and say that any equality comparison of float numbers has to be a one-off special case. You need to know how much error can arise in your calculations, and you need to know how far apart legitimately different numbers will for your particular data. And of course the former has to be smaller than the latter.
Indeed, FLT_EPSILON is not a one-size-fits-all solution, but it's good enough for frequent case of comparing big enough numbers, which is not covered by regular ==. So it's a convenience/correctness trade-off I'm ready to make.
If the numbers you are comparing are greater than 2, abs(a - b) < FLT_EPSILON is equivalent to a == b. Because it's not possible for two large numbers to not be equal, but also closer together than FLT_EPSILON.
But it's impossible to have a number that's 0.00000011920929 less than 5.0, or 0.00000011920929 more than 5.0, because the floats with enough magnitude to represent 5 are spaced further apart than that. Only numbers with magnitude < 2 are spaced close enough together.
In other words, the only 32-bit float that's within ±0.00000011920929 of 5.0 is 5.0 itself.
Picking out an obvious #define that compares floats in that way should indicate a good understanding of why it might be called wizardry and deserve a second look.
Hats off to the peer comment that suggested scaling against epsilon rather than simply regurgitating the substitution as-is from the header.
The scaling is better in general, optional in some specific contexts.
it uses absolute difference epsilon equality ('close enough to be considered equal'):
static int pretty_float_equal (float a, float b) { return fabsf(a - b) < FLT_EPSILON; }
static int pretty_double_equal (double a, double b) { return fabs(a - b) < DBL_EPSILON; }
static int pretty_long_double_equal (long double a, long double b) { return fabsl(a - b) < LDBL_EPSILON; }
Sorry for what is probably a stupid question. Does pretty.c act as a preprocessor or sorts, converting your pretty.c script into actual c, that is then compiled? Or is it a virtual machine that interprets your pretty.c script?
It's a set of C Preprocessor macros, so you don't even need to somehow process the code—you just #include the header and hack away as if it was regular C!
- Provide so much syntactic sugar as to cause any C developer a diabetes-induced heart attack.
- Deprecate Lua, Python, JavaScript, Ruby and a dozen other languages, because Pretty C is the ultimate scripting language, but lightning-fast and strongly typed!
- Including only one header (yes, Pretty C is a header-only library #include-able from arbitrary C file!) to turn any codebase into a beginner-friendly one.
It’s not a preprocessor or compiler. There’s no binary. It’s just a bunch of C code in a header: macros, functions, etc. that you can use to write different looking programs but that are still C.
I've seen this implementation of defer a few times now. I really dislike calling this defer (like the keyword in Go) as the given code won't be executed on return.
No no, I think you misunderstood my critique. Defer working on block scope is fine; however, if I exit the block through a return (or break), the deferred function is not called.
To my knowledge, you need a compiler extension to implement this in C (or use a C++ destructor).
As someone who just got diagnosed with type-1 diabetes (the auto-immune variety, not the “you eat too much sugar” variety), this was far more depressing than funny. I’m probably being overly-sensitive, but man my life has gone to shit in the last couple of years…
There’s nothing wrong with my diet, that’s type-2. Type-1 is an auto-immune disease, and in my case was triggered by a couple of years of immense stress. Cortisol and the like wreak havoc over the long term. Mine isn’t curable, I don’t have many insulin-producing cells left in my pancreas.
All because two years ago, a hospital tried to make a little bit more money by turning beds faster, and fed sodium into my wife at about 2x the recommended rate. Although awake, she has never recovered from the coma, and of course she has brain damage. It’s not like in the movies where you just wake up.
I’m not going into details, but when you scream yourself hoarse from the agony in your limbs, and there’s no painkiller that works, and eventually you lose your voice but you’re still screaming, silently… Yeah, that’s pretty terrible.
Neither of us will recover. I, at least, can manage it with insulin injections/pumps/CGM-patches… She is currently in a “mental health facility”, something she’s had happen to her several times this year. It looks pretty on the outside and resembles a prison once you step beyond the secured entry. It’s where hope goes to die.
All because a “hospital” wanted to make even more money than normal, and didn’t give a shit about their actual patients. Yes, I’m bitter.
I appreciate the sentiment behind your words, thank you for that. It won’t help, but still.
> Deprecate Lua, Python, JavaScript, Ruby and a dozen other languages, because Pretty C is the ultimate scripting language, but lightning-fast and strongly typed!
Creating DSLs within C has a long tradition.
Stephen Bourne wanted to write his shell in ALGOL so badly that he relentlessly beat C with its own preprocessor until it began to resemble his preferred language.
https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh...
Here is an example of what we wrote using it:
https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh...
This is not even half as bad as I expected it to be.
excellent lore. said to have been the inspiration for the obfuscated C code contest.
https://www.ioccc.org/
Author here. I don't see any problem with this.
Well, for a start, C is rarely considered "strongly typed". Statically typed, yes, but strongly typed, not so much.
C is from the 1970s.
Java is weakly typed in its generics, despite being statically typed. I’m sure there are more examples.
I don't think Java is weakly typed even in generics. You can't "fake" your way with types like in C; you need to explicitly cast, which fails if you try to make an invalid type cast.
The more pedantic compiler flags you introduce, the more strongly typed it becomes.
That's pretty clearly said jokingly
I do not at all think the author is confused. Being confused is OK though.
It claims to be a scripting language but you still have to compile the programs. Boo! Add CINT (https://root.cern.ch/root/html534/guides/users-guide/CINT.ht...) and you can have instantaneous execution and even a REPL!
Given the idea behind this repo is to cause pain, why not add a shebang to your file [0] to make it executable.
I saw a blog post a long time ago that went into the details of how ./foo worked, and how it executed an elf file. You could register `.c` programs in the same way to be compiled and run?
[0] https://gist.github.com/jdarpinian/1952a58b823222627cc1a8b83...
Now I have a very evil idea: what about registering a binfmt handler for the header bytes “#include”? Sure, it doesn’t handle all C/C++ programs (notably any program that dares to start with a comment), but it would not require modifying any source code!
(For even more insanity I guess you could also trigger on // and /*, although there’s some risk of false positives then!)
I'd prefer just using tcc [0]. Far lighter weight than that monster. And C, not C++.
[0] https://bellard.org/tcc/tcc-doc.html
Cern uses cling now (https://github.com/root-project/cling)
Well, who said that scripting language cannot be compiled? And yeah, Clang-REPL is another way to make it REPL-friendly.
Sure, there is no "rule" against it. But words/phrases have commonly-accepted meanings and willfully ignoring or appropriating those meanings implies either cultural ignorance or a concealed agenda.
If you want to insist that scripting languages can be either compiled or interpreted, then its better to just drop it altogether and just say "language" because the "scripting" part has utterly lost its identity at that point.
generally they aren't, as scripting usually implies an interpreter, though no one is stopping you from using a wrapping script that quietly compiles on first run and caches a bunch of executables somewhere. not much different than python producing bytecode files as it goes along.
Script usually implies some kind of task that runs once and the exits. As opposed to a system that is expected to run indefinitely.
There are good reasons for why scripts are often interpreted and why systems are often compiled, but that's not what defines them. There are definitely scripts that are compiled and systems that are interpreted out in the wild.
'scripting' is an ill-defined term with many interpretations, certainly.
If that is the case, pick another interpretation and describe to us what "non-scripting" then might be.
the original definition is likely tossing shell commands in a file to run later. chaining commands together. since perl and python supplanted this, they get lumped in as 'scripting languages'. both certainly can be used to write long running systems or short one off tasks.
compiled languages are rarely used for one offs because the effort they require is usually greater than the task calls for.
a big part of perl/python use is in tying together libraries written in more difficult lower level compiled languages.
you'll also see scripting used to refer to languages embedded in larger projects. lua scripts to control entities in a game, for instance. do they compile these somehow? I never did in the little project I used lua for.
----
all of that together, I expect that scripting as a concept largely boils down to conceptually simpler languages with less view of the ugly underbelly of how things actually work in a computer, used to chain together abstractions created by lower level code.
scripting is duct-tape. whether you duct-tape together a one-off task or some wad of long running functionality is besides the point.
> you'll also see scripting used to refer to languages embedded in larger projects.
Yes, but this is conceptually exactly the same as the aforementioned shell scenario. This is not something different.
Just as I suspected, there is only one definition, and one that has proven to actually be well defined to boot as you managed to reiterate the only definition I have ever known to perfection.
> Provide so much syntactic sugar as to cause any C developer a diabetes-induced heart attack.
Haha love this!
Well, there's a few things I should probably get around to adding to CNoEvil[0] and ogw[1]... There always seem to be more every few months when this project reappears.
[0] https://git.sr.ht/~shakna/cnoevil3/
[1] https://git.sr.ht/~shakna/ogw
"It takes a whole lot of bad ideas and mashes them into an abhorrent monstrosity."
I love this to the very core of my being.
For what it’s worth this makes the same mistake that Python 2 did: string and bytes are not the same type and shouldn’t be treated as such.
What do you consider the type of shell text, i.e. what's in argv and what you get from subprocess output? It's not well-formed utf8 strings because any random garbage can be in there, yet tools like awk and grep are ubiquitous.
I'd argue that strings and bytes are the same general type, but it's sometimes useful to give well-formed utf8 bytes a different type internally. Rust gets this mostly correct with OsString and String.
The way I understand it: Bytes are just bytes, until you provide an encoding. Then they can be can be converted to a string, if validly encoded. Taking an array of characters and just treating it or casting it as a string is usually a bad idea.
The thing I think Rust maybe goofed, or at least made a little complicated, is their weird distinction between a String and a str (and a &str). As a newbie learning the language, I have no idea which one to use, and usually just pick one, try to compile, then if it fails, pick the other one. I'm sure there was a great reason to have two types for the same thing, that I will understand when I know the language better.
I wrote a blog post that may help you! https://steveklabnik.com/writing/when-should-i-use-string-vs...
If you want to understand more deeply, the Rust Programming Langauge, chapter 4, uses String and &String and &str to talk about ownership and borrowing. Here’s a link to the start of that chapter: https://doc.rust-lang.org/stable/book/ch04-00-understanding-...
How timely and helpful, thanks!
Your blog post is practical and clearly explains what to do, when, which is helpful. What's confusing is why Rust has the two types and why the language designers decided it was a good idea to have to convert back and forth between them depending on whether it was going in a struct or being passed as an argument. I suppose the "why" is probably better found in the Rust docs.
As a long-time C++ user, it seems like std::string vs const char* all over again, and we somehow didn't find a better way.
Yep, that’s exactly it: I wanted to focus purely on what to do, rather than weigh it down with what’s already in the Rust book.
It’s closer to std::string and std::string_view. But yes, in a language with value and reference semantics, when you also care about performance, you just can’t do any better: you need both types. Or at least, if you want the additional correctness guarantees and safety provided by communicating ownership semantics in the type. C gets away with just char * but then you have to read the docs to figure out what you’re allowed to do with it and what your responsibilities are.
Rust has two different types because they are fundamentally different things, just like `std::string` and `const char *` are!
A pointer to some memory is not the same thing as a struct that has a pointer to memory, as well as a capacity field and the ability to resize itself.
In C++ terms, String is std::string, &str is std::string_view. They're different things, but they can appear similar.
A Rust `String` reference (i.e. &String) can always be passed where `&str` is expected because `String` has a `Deref<Target=str>` impl... in that sense they don't just appear similar, they are polymorphic.
There may not be a single encoding for every byte in a string. The encoding may not be knowable ahead of time. You might be trying to extract strings from a random blob of bytes with unknown origin. There's a thousand and one different variations.
To give a real example, I once wrote some python scripts to parse serial messages coming off a bus. They'd read the messages, extract some values with regex, and move on.
Unfortunately the bus had some electrical bugs and would intermittently flip random bits with no CRC to correct them. From my point of view, no big deal. If it's in something outside the fields I care about, I won't notice it. If it's flipped something I do care about we have a bad sample to drop or noise the signal processing will deal with. Either way, it's fine. Python on the other hand cared very much. I rewrote everything in C once I got sufficiently annoyed of dealing with it and more importantly explaining to others how they couldn't "simplify" things using the stdlib APIs.
Python's stdlib conveniently supports both byte strings and Unicode strings, even for regexps. There is no need to migrate to any other language.
String in rust is roughly like this in C:
&String in Rust is roughly like `const struct string *`.

str in Rust is just an array of (guaranteed utf-8) unsigned bytes. It does not have a capacity, so it can't be resized. You can't directly construct one (on the stack), because its size is undetermined and Rust doesn't have dynamic-sized stack allocation.
&str, and Box<str>, are pointers to str, along with a size, and are roughly like this C:
The difference between &str and Box<str> is that the latter is an owned pointer to a heap allocation which will be freed when it goes out of scope. &str is unowned and might point anywhere: to a Box<str> on the heap, to a String on the heap, or to read-only static memory.

IMO, it's probably easier to first try to understand the difference between `Vec<u8>`, `&[u8]`, and `&Vec<u8>`, because they are slightly less "weird" than the string types: they aren't syntactically special like `str` is[1], and they don't have an implicit requirement to be utf8 that is inexpressible in the type system.
[1]: `str` is syntactically special because it is basically a slice, but isn't written in slice notation.
String/str are both valid UTF-8 by definition, though. Plain ol' piles of bytes in Rust are generally represented by Vec<u8>/[u8].
Rust could have done better in naming, but a definite design goal of the language (for better and worse) is to not make things that are complicated for the compiler appear simple to the user. Which unfortunately results in:
What you see on the screen of a terminal is Unicode strings. It is human readable text. len(“”) is 3 even if the underlying encoding holds it as 6 bytes.
Of course if you provide a separate set of functions for treating a string as human readable vs not you can also work with that. Basically len() vs byte_len().
But you can’t concat two human readable strings without ensuring they are of the same encoding. You can’t search a string by bytes if your needle is of a different encoding. You can’t sort without taking encoding and locale preferences into account, etc.
Pretending like you don’t care about encoding doesn’t work as we have seen time and again.
Given the nature of it (pretty.c) and the stated intention of being "backwards-compatible with C and all of its libraries", what would make more sense than sticking with C's multibyte strings?
https://en.cppreference.com/w/c/string/multibyte
Right but pretty.c doesn’t seem to explicitly support those.
How so? it’s just char*
What is len() of a char* vs a Unicode string?
char* is just raw bytes.
At the language level C historically hasn't offered much support for working with specific character sets and their encodings. With C17 and C23 we get u"...", U"...", u8"...", type char8_t, and similar, but there's still little/no built-in tooling for text processing.
For text processing work with char* whose bytes are some encoding/s of Unicode, e.g. UTF-8, then you an use a C library such as libunistring or ICU.
However the bytes of a char* could instead be an encoding of a non-Unicode character set, e.g. GB2312 encoded as EUC-CN.
So char* is character set and encoding agnostic. And C-the-language doesn't even try to offer you tools for working with different sets and encodings. Instead, you can use a library or write your own code for that purpose.
A number of languages make the same decision, keeping the string type set/encoding agnostic, with libraries taking up the slack.
In Nim, for example, the string type is essentially raw bytes (string literals in .nim sources are UTF-8). If you're doing Unicode text processing then you'd use facilities from the std/unicode module
https://nim-lang.org/docs/unicode.html
Same story with Zig
https://ziglang.org/documentation/0.8.0/std/#std;unicode
Lua too, and you'll probably use a 3rd party library such as luautf8 for working with Unicode/UTF-8
https://github.com/starwing/luautf8
Returning to the matter of pretty.c, since it's just sugar for C, it makes sense (to me) that the string type is just an alias for the set/encoding agnostic char*. It's up to the programmer to know and decide what the bytes represent and choose a library accordingly.
I don't agree. This doctrine presumes all of the following:
None of these things are consistently true. It's a useful practice to wrap validated byte sequences in a type which can only be created by validation, and once you're doing that, `Utf8String` and `EmailAddress` are basically the same thing; there's no reason to privilege the encoding in the type system.

I mean other languages make it work.
What is your definition of "string"?
If it's "human-readable text", then fine, a string is not the same thing as an arbitrary byte array.
But lots of languages don't enforce that definition.
Well that's the very thing: not enforcing that distinction is the very mistake in question.
Reminds me of a C++ codebase I once had to inspect that was entirely written as if it were written in Java. With camelcase naming for everything, getters and setters for every class variable, interfaces everywhere.
You ain't seen nothin. Check out the bourne shell source code from unix seventh edition. https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd... I can't believe it's not ALGOL.
https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd...
wow.
thanks for this gem.
Wow, I was not expecting that! Was this style of C common back then?
Before he wrote the Bourne shell the author wrote an ALGOL compiler, and ALGOL inspired Bourne syntax:
https://en.wikipedia.org/wiki/ALGOL_68C
There were articles suggesting #define BEGIN { and #define END } to make C look more like Pascal.
I think in Europe C was not as common as other languages at the time so the terseness looked odd.
Also because the special characters were (and are) difficult to type on European keyboards.
Characters like []{}\|~ are behind multi-finger access and often not printed at all on the physical keys (at least in the past). You can see how this adds a hurdle to writing C…
Pascal was designed by a European, so he preferred keywords which could be typed on every international keyboard. C basically just used every symbol from 7-bit ASCII that happened to be on the keyboards in Bell Labs.
Just as example, on my slovenian QWERTZ layout: [ - altgr+f, ] - altgr+g, { - altgr+b, } - altgr+n, \ - altgr+q, | - altgr+w, ~ - altgr+1.
You get used to them, though you start feeling like a pianist after a short coding session. The one most annoying for me are the fancy javascript/typescript quotes, which I have to use all too often: ` - altgr+7.
Today I learned that there exist people who use non-US layouts when coding. That’s spectacular!
I tried switching to US a few times, but every time muscle memory made me give up soonish - especially since there are big benefits to using same keyboard layout as other people in your office are using.
Also practically everytime I need to write a comment, commit message or email I need my č, š and ž. It's kinda nice to have them only a single keypress away.
My hack: use caps key to switch to local keyboard layout while holding it.
Love it! I use ctrl+space to switch, but your idea sounds even better
How did you think people outside the US learn programming?
I'm from a non-English country. I only ever use layout of my locale when I write in my language. That's how it was ever since I was a kid who knew little English. And that's how all computers I've encountered in my country are set up - English first, local second.
In addition, our layout, overwrites only the numerics – all other symbols are the same as on a US layout.
There’s so many assumptions here about a person who’s starting to learn programming.
For starters, that they’re on Linux, they feel comfortable running complex CLI commands, they can memorize the U.S. layout just like that, and that they can type without looking at the physical keys (because changing the virtual mapping means keys produce something else than what the label says).
In reality, the learner’s first exposure to C family languages is more likely to be a website where you can run some JavaScript in a text box. And the first hurdle is to figure out how to even type {}. American developers just completely forget about that.
Installation of Windows and MacOS defaults to US + local layouts.
In the long term, using the native keyboard layout hinders you a lot. I tried to do so with the Spanish (es) layout; it's pretty much unergonomical.
It looks like it was deliberately designed for press/office usage and not for proper programming.
I’ve been writing C and its progeny (C++, JavaScript, Rust etc.) since 1990 on a Finnish keyboard.
The AltGr brackets are fine. The truly annoying character to type is the backtick (which is a quite new addition to the pantheon of special characters, C doesn’t use it).
My personal opinion is that Niklaus Wirth had the better overall ideas about clarity and inclusiveness in programming language design, but that battle is long lost. (What you consider the character set needed for "proper programming" is really a relatively new development, mid-1990s and later.)
Backticks were fairly important for shell scripting in the past, but have officially been replaced with $(), which can be nested.
My intuition is that Perl would be the most challenging on a keyboard where it's harder to type unusual punctuation, since it feels like a very punctuation-heavy language, but I don't know whether it actually uses more than C (I think the backtick has a shell-style meaning in Perl too).
>it's pretty much unergonomical.
Well, unless opting for something like Dvorak, you are indeed doomed to a layout that was specifically designed to please typewriter mechanical constraints without much care for the resulting ergonomics.
I use a Bépo layout personally, on a Typematrix 2030 most of the time, as French is my native language.
or maybe popular proglangs were designed for writing on USAn press/office keyboards – remember that UNIX came to be as a typesetting appliance — disregarding anyone else.
Spectacular?? Terrifying. If I need to type non-ASCII Latin characters I'll just use compose sequences. The thought of a non-U.S. keyboard layout with modifiers required to type []{}<> and so on is terrifying.
IIRC, Pascal had/has (* and *) as an alternative to { and } , from the start, or from early on - as syntax for start comment and end comment.
> camelcase naming for everything, getters and setters for every class variable, interfaces everywhere
This is not far off from the guidelines in many cases, e.g. Windows code (well, not every variable of course.) A lot of Java design was copied from C++.
I've seen similar codebases as well, written by people who have spent way too much time with Java. One even had its own String class which was just a wrapper for std::string with Java-like methods.
Good job they weren't using MSVC I guess...
https://learn.microsoft.com/en-us/cpp/cpp/property-cpp?view=...
I had that as well, but also Java passes strings as f(String *), so the C++ code was f(new String("Hello")).
I think that's just OOP
You might like https://aartaka.me/oop-c
oooh that was your creation. it makes (barely, I'm stupid) sense now
If you find this interesting, you might like libcello.h as well! https://www.libcello.org
Yes, it's one of the inspirations!
Type names are nice; perfect choice for the in-built func macros (like min); len -- love it. Named boolean operators -- might be a bit much, but go for it. Ternaries are illegible, so you can only improve them. Not completely sold on all your loop definitions, but some make sense to me. Resource tracking is impressive. The for... macros look a bit ugly -- could probably call them something else.
All in all: quite a solid attempt. I'll give you 8/10 for the design of this. The way you sketched this out in C using macros is really elegant. This actually looks like good code. Would I use it? It's a new language and I like C already. It could help people learn C and think about language design. Since the way you've done this is very clear.
Well, you don't have to use it all. My projects mostly use booleans, len(), max/min, and some operator aliases, because there wasn't much need for other tasty stuff yet. So give it a shot, even if for a couple of operator macros!
You know I expected your macro file to be unreadable moon math. But it actually doesn't look bad.
> ifnt for if(!...).
"unless" seems more readable than "ifnt".
Another bikeshed is the infinite for(;;) loop being called "always"
I've seen "loop" in other languages. But Qt calls it "forever", and that is indeed very pretty. Very Qt, even
for(evernt) {}
> I've seen "loop" in other languages. But Qt calls it "forever", and that is indeed very pretty. Very Qt, even
You can break a "forever" loop so I think "loop" is a better name.
I don’t know why “repeat” isn’t very common in place of while/loop/etc; it works out nicely grammatically.
One possible reason:
> The word "REPEAT" should not be used in place of "SAY AGAIN", especially in the vicinity of naval or other firing ranges, as "REPEAT" is an artillery proword defined in ACP 125 U.S. Supp-2(A) with the wholly different meaning of "request for the same volume of fire to be fired again with or without corrections or changes" (e.g., at the same coordinates as the previous round).
https://en.wikipedia.org/wiki/Procedure_word#Say_again
More seriously, PASCAL has repeat-until loops, similar to do-while loops in C.
Pretty C does alias "repeat" to "do", so yeah, I've got you covered!
"indefinitely" might be a better name. (But I think loop is indeed a better name.)
Added in commit ef510ca!
I hope you also add a “definitely”, for symmetry.
"loop" added in commit 626408b, thank you!
"forever" added in commit 67ff9ef, thank you!
On the other hand, ifnt is fun to say out loud.
Indeed! But I've reserved "unless" for a ternary conditional, which is more useful anyway.
Oh shit wait, you're John Tromp, BLC creator! I'm a fan!
Is it possible to tangle the Readme into pretty.h? In other words, are the codeblocks in the orgfile exhaustive.
I love the literate way you have explained your thought process in the readme.
Yes, code blocks in Org are executable, but I was aiming for simple embedding and zero build-time, thus conservative choice of separating README and the actual header.
I have not decided how I feel in general, but:
> Everyone defines these, so why not provide them?
Honestly, that's fair.
> turn any codebase into a beginner friendly one
Okay then.
I was hoping to see a “this is just for fun” disclaimer but didn’t see one. Please never actually use this in a project that other people will have to read or contribute to.
> Provide so much syntactic sugar as to cause any C developer a diabetes-induced heart attack.
seems like its obvious to me that its a joke
It's a joke that I would happily use.
C is funny, in many ways.
You're welcome!
Not everything needs to be stated explicitly, where's the fun in that?
No promises, people who want to have fun are going to have fun despite requests not to have fun.
Wow, neat! The wildest part to me is
> And it’s backwards-compatible with C and all of its libraries!
I can't wait to give it a shot! This looks like a riot.
Have you heard of Zig?
It requires a different compiler. This is just a collection of C preprocessor macros
The Zig toolchain can compile both Zig and C.
Yes, but the Zig toolchain is not $YOUR_EXISTING_C_COMPILER_YOU_ALREADY_KNOW_AND_USE
I hadn't yet, but it does look nice. I especially like that you can just say "defer deinit", that's really nice.
This project looks really cool! Unfortunately, there’s just way too much magic involved. In my humble opinion, C is simply not the language for this level of magic—extreme use of macros (and hidden behavior in general) is how you end up with hard-to-detect (and hard-to-debug) bugs and security vulnerabilities. As cool as this project looks, I’d never feel comfortable using it in anything serious. A+ for the effort though!
All that is missing is a garbage collector. Should be possible to implement one by overriding malloc & friends?
You can use the Boehm-Demers-Weiser GC with C. It's conservative, because it has to be with C, so it may/will miss things (it will treat integers etc. as potential pointers, and so avoid freeing anything "pointed to" by them), and so it works best as an extra layer of protection/leak detector, but it can be used as a replacement for freeing memory too.
> if (argc above 1)
I give up.
You're welcome!
This is terrifying
Thanks!
I feel compelled to try it out in a serious way and contribute to it. I have strong knowledge of Python and am learning C. Are there good reasons (apart from attracting the ire of C programmers) to not use it?
Author here. I'll be glad to accept any contribution that makes C more readable, so PR away!
This should have been invented 50 years ago!
Yes, and it's a shame that underlying features were only shipped in C11 (generics) and C23 (auto type inference!)
I'm reminded of the guy who did
and a whole bunch of other macro malarkey just to make C look more like Pascal. Only then would he deign to code in it.

https://en.wikipedia.org/wiki/Stephen_R._Bourne
https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh
Now that's just silly. And I see the backwards keyword terminators (LOOP/POOL).
I have wondered why we have case/esac, if/fi but while/done. I imagine the author himself figured that while/elihw would just be entirely ridiculous.
> I have wondered why we have case/esac, if/fi but while/done.
With the reverse-keyword convention we'd get "od", not "elihw", though.
The 'od' utility already existed, apparently, so Bourne opted for "done".

[edit: typos]
No, there’s OD. DONE is different but no less perverse.
Just call it wyl/lyw. Pronunciation maintained, problem solved.
Does "strong typing" now just mean "static typing"? Afaik both lua and python are already strongly typed. Javascript is not and I have no clue about ruby.
> Does "strong typing" now just mean "static typing"?
The distinction strong and weak typing is irrelevant in practice.
Weak (but present) static typing beats strong dynamic typing every single time, because what is valuable is NOT "Do I see a type mismatch error only when a user accesses it?", it's "does this mismatch prevent a deployment?"
IOW, the only distinction in production is dynamic typing vs static typing, not strong typing vs weak typing.
> because what is valuable is NOT "Do I see a type mismatch error only when a user accesses it?", it's "does this mismatch prevent a deployment?"
I argue that understanding the semantics clearly and unambiguously is the most relevant thing, and strong typing tends to do that better (imho—with my only other examples being javascript and the "stringly-typed" perl and some random irrelevant BASIC dialects).
> Weak (but present) static typing beats strong dynamic typing every single time,
Can you give me an example? I don't think I've ever heard of such a thing. The closest I can think to this is maybe, arguably, the use of `void*` pointers in C which is difficult to see as anything other than pragmatism rather than some deeply beneficial way to write code—even explicit casts produce much more readable code. Another argument I could see is for operator overloading, which (IMO) produces much less readable code, or the implicit conversions feature of Scala (which also, IMO, produces less readable code, but they've addressed a lot of the major problems with it).
Recommend setting up tests for your code such that failures block deployment; tests can catch categories of bugs beyond typing errors.
Evil, yet beautiful. Hats off to you.
Thanks!
This made me immediately think whether MIT Loop of Common Lisp was an inspiration here. Checked the user's profile and sure enough, a lisper!
Yes, LOOP is a huge inspiration and my favorite programming language!
`equal(0.3, 0.2 + 0.1); // true`
how is this wizardry possible?
It uses type dispatch to perform an epsilon comparison:
So it’s https://docs.python.org/library/math.html#math.isclose

This code is incorrect, but I don't blame them. :) Probably one of the most common float-related mistakes, even among people who "know how floats work".
FLT_EPSILON is the difference between 1.0 and the next larger float. It's impossible for numbers less than -2.0 or greater than 2.0 to have a difference of FLT_EPSILON, they're spaced too far apart.
You really want the acceptable error margin to be relative to the size of the two numbers you're comparing.
Also, everyone should read the paper "What, if anything, is epsilon?" by Tom7
I would go even further and say that any equality comparison of float numbers has to be a one-off special case. You need to know how much error can arise in your calculations, and you need to know how far apart legitimately different numbers will for your particular data. And of course the former has to be smaller than the latter.
Indeed, FLT_EPSILON is not a one-size-fits-all solution, but it's good enough for frequent case of comparing big enough numbers, which is not covered by regular ==. So it's a convenience/correctness trade-off I'm ready to make.
If the numbers you are comparing are greater than 2, abs(a - b) < FLT_EPSILON is equivalent to a == b, because two such numbers cannot be unequal yet closer together than FLT_EPSILON.
what am i missing?

FLT_EPSILON is not 0.01, it's 0.00000011920929.
But it's impossible to have a number that's 0.00000011920929 less than 5.0, or 0.00000011920929 more than 5.0, because the floats with enough magnitude to represent 5 are spaced further apart than that. Only numbers with magnitude < 2 are spaced close enough together.
In other words, the only 32-bit float that's within ±0.00000011920929 of 5.0 is 5.0 itself.
Oh you're right, thanks for the explanation!
Gotta research now where the 0.00000011920929 number comes from...
It's the distance between 1.0 and the next representable float.
I got that but I am curious how to derive that number.
Is it representable as a non-trivial ratio of integers?
Good question! It's 1/(2**23), because 32 bit floats have 23 bits after the decimal point
Is What Every Computer Scientist Should Know About Floating-Point Arithmetic wrong ??!!
addendum: why are obviously rhetorical questions taken so literally here?
Because text doesn't convey sarcastic voice tonality, so the intent is far from obvious.
Sarcastic? Okay, if you say so.
Picking out an obvious define function that compares a float with a float sum of that nature should indicate a good understanding of why that might be called wizardry and deserving of a second look.
Hats off to the peer comment that suggested scaling against epsilon rather than simply regurgitating the substitution "as was" from the header.
The scaling is better in general, optional in some specific contexts.
It's meant as both humorous and a nerd snipe :)
it uses absolute difference epsilon equality ('close enough to be considered equal'):
This is wrong code. It only works somewhat correctly when a and b are around 1.
Yeah, should be scaled like |x - y| <= ε * max(|x|, |y|)
Will do.
If both terms are infinities of the same sign, subtraction will give NaN and it will fail.
How is that a problem? infinities shouldn't be considered equal
For IEEE 754, and in Java for example, they are. Only NaN is not equal to itself (and different from itself).
static int pretty_float_equal(float a, float b) {
    return fabsf(a - b) < FLT_EPSILON;
}
The code assumes that C17 has C++-style auto (https://github.com/aartaka/pretty.c/blob/master/pretty.h#L11...), but it does not (in C, auto is a storage-class specifier equivalent to no storage specifier).
C17 doesn't have auto, but C23 does, and thus my `__STDC_VERSION__ > 201710L` (notice the greater than sign, it doesn't include C17 itself.)
Ha, apparently the N2310 working draft is not the last one :)
Foreach looks convoluted: you shouldn't have to list the type.
It should be no harder than C#’s foreach(var i in list)
Indeed, I might need to revisit foreach with type inference. Should be totally possible.
Sorry for what is probably a stupid question. Does pretty.c act as a preprocessor of sorts, converting your pretty.c script into actual C that is then compiled? Or is it a virtual machine that interprets your pretty.c script?
It's a set of C Preprocessor macros, so you don't even need to process the code somehow—you just #include the header and hack away as if it were regular C!
Preprocessor of sorts. From the readme:
The goals for Pretty C are:
- Provide so much syntactic sugar as to cause any C developer a diabetes-induced heart attack.
- Deprecate Lua, Python, JavaScript, Ruby and a dozen other languages, because Pretty C is the ultimate scripting language, but lightning-fast and strongly typed!
- Including only one header (yes, Pretty C is a header-only library #include-able from arbitrary C file!) to turn any codebase into a beginner friendly one.
It’s not a preprocessor or compiler. There’s no binary. It’s just a bunch of C code in a header: macros, functions, etc. that you can use to write different looking programs but that are still C.
This reminded me of ArnoldC
https://lhartikk.github.io/ArnoldC/
I've seen this implementation of defer a few times now. I really dislike calling this defer (like the keyword in Go) as the given code won't be executed on return.
Scoping defer to a block is actually more useful and explicit than the function-scoped defer that Go does, so I consider that a feature.
No no, I think you misunderstood my criticism. Defer working on block scope is fine; however, if I exit the block through a return (or break), the deferred function is not called.
To my knowledge, you need a compiler extension to implement this in C (or use a C++ destructor).
I’m waiting for someone to write a lambda calculus based C++ library that allows everything to be defined in terms of function. Peano axioms and all.
I feel like this would have been cool 25 years ago
Given the title, shouldn’t that be #include "pretty.c" instead of #include "pretty.h"?
That is pretty cool
> The goals for Pretty C are:
> Provide so much syntactic sugar as to cause any C developer a diabetes-induced heart attack.
:-D
As someone who just got diagnosed with type-1 diabetes (the auto-immune variety, not the “you eat too much sugar” variety), this was far more depressing than funny. I’m probably being overly-sensitive, but man my life has gone to shit in the last couple of years…
Uff, I’m so sorry to hear! I can see how that joke could be really not funny for someone that has gone through some crappy stuff.
I hope you stay strong as you work through such a rough life-changing diagnosis.
Sorry to hear.
Honest advice: I have heard fasting + healthy diet can permanently fix diabetes (though I forgot which type)
Maybe try it? Good luck in any case!
There’s nothing wrong with my diet, that’s type-2. Type-1 is an auto-immune disease, and in my case was triggered by a couple of years of immense stress. Cortisol and the like wreak havoc over the long term. Mine isn’t curable, I don’t have many insulin-producing cells left in my pancreas.
All because two years ago, a hospital tried to make a little bit more money by turning beds faster, and fed sodium into my wife at about 2x the recommended rate. Although awake, she has never recovered from the coma, and of course she has brain damage. It’s not like in the movies where you just wake up.
I’m not going into details, but when you scream yourself hoarse from the agony in your limbs, and there’s no painkiller that works, and eventually you lose your voice but you’re still screaming, silently… Yeah, that’s pretty terrible.
Neither of us will recover. I, at least, can manage it with insulin injections/pumps/CGM-patches… She is currently in a “mental health facility”, something she’s had happen to her several times this year. It looks pretty on the outside and resembles a prison once you step beyond the secured entry. It’s where hope goes to die.
All because a “hospital” wanted to make even more money than normal, and didn’t give a shit about their actual patients. Yes, I’m bitter.
I appreciate the sentiment behind your words, thank you for that. It won’t help, but still.
That sounds rough man.
I don't know what to say except to wish you and your wife utmost strength.
Only thing I personally find solace in is that I am certain there is a life after this one in which everything will be amended by the One God Himself.
> Deprecate Lua, Python, JavaScript, Ruby and a dozen other languages, because Pretty C is the ultimate scripting language, but lightning-fast and strongly typed!
Umm… that’s quite the goal.
I’ll stick with deprecated Python.
Deprecate Python?! You'll have to deprecate me first! ;-)
We will all reach this status at some point, though the silly code we produced might stick a bit longer in some legacy codebase. :D
I cannot wait to show this to a colleague of mine. He will kill me XD
Can't wait to learn of how it went!
> max and min of two numbers.
Influenced by windows.h I see :)
as someone that just started C, it looks pretty :)
Thanks!
Are variadic macros Turing complete?
Yes. https://github.com/rofl0r/chaos-pp
This is as horrific as it is wonderful.
Thanks!
love. It.
Thanks!
does this transpile to C or how does it actually work?
it's just aliases
The preprocessor uses those #defines to replace tokens in the code, so it ends up as plain C.
Or just use Ruby.
someone's salty about tiobe!
So sweet :))
Yeah, I love sugar ;)
Can we just pascal?
I won't stop you, so be my guest.
Meta: the naming is ... strange.
The actual name of the repo is "pretty.c", but the name used for the language/dialect/result/horrorshow[*] is "Pretty C".
The actual code file you include is called "pretty.h", which makes sense since it's a header, of course. Confusing!
Edit: escapes.
[*] Yes, I'm a C programmer, currently hunting for black-market insulin to combat the rapid-onset diabetic attack from all that sugar. Sorta.
I mean, don’t say the repo didn’t warn you!
> The goals for Pretty C are: Provide so much syntactic sugar as to cause any C developer a diabetes-induced heart attack.