My pure C implementation of C2 is moving forward slowly (at
https://github.com/lerno/titanos - look at the development branch). It's using LLVM exclusively (no C gen).
I'm focusing on the areas where C2C is weak today, so that would both be LLVM and constant folding.
The compiler uses BigInt for integer constants (and really should do the same for floats, but right now it doesn't). I recently worked on the implicit casts and in order to make sense out of it I've changed a bit from C's implicit casting, mostly following Zig.
To recap C's casting rules:
- with two operands where at least one is float, upgrade everything to the biggest float (float -> double -> long double)
- if no float is found, int conversion resumes.
- If both types are signed or both types are unsigned: promote to the largest of the (eg short -> int -> long etc) (unsigned short -> unsigned int -> unsigned long etc)
- If they have different sign and the signed version can represent represent all numbers of the unsigned (e.g. u16 - i32), promote to the signed version.
- If it's not possible to represent it, promote to the unsigned version of the type. (e.g. u32 - i32 => u32, u64 - i32 => u64)
For constant folding I've followed these rules instead:
- Float conversion as with C
- Folding an operation with bigint and bit limited int will convert the bigint to that bit size. If it's not possible to convert the constant, a compile time error will occur. E.g. const i8 c = 1; const i32 d = c + 200; is illegal since 200 cannot be converted to i8 without loss.
- If two non bigints are folded, it's converted to the biggest type (like in C), for signed / unsigned conversion: if the unsigned constant may be contained in the signed type, then the conversion is valid, otherwise it is a compile time error. (e.g. cast<u8>(200) + cast<i8>(1) is a compile time error)
- In addition, constant overflow is a compile time error. So cast<u8>(200) + cast<u8>(100) would be a compile time error as 300 does not fit in u8.
In order to make unsigned <=> signed conversions I'm thinking of some very simple lossy conversion, like
i8 a = -1;
u8 b = @ucast(a); // Bitcast of a
a = @scast(b); // Bitcast of b
Maybe even add a special assignment like (placeholder syntax!):
b u= a; // same as b = @ucast(a)
if (b u== a) ...; // same as if (b == @ucast(a)) ...
a s= b; // same as a = @scast(a)
if (b s== a) ...; // same as if (@scast(b) == a) ...