Author Topic: Dynamic arrays & fixed arrays  (Read 12623 times)

lerno

  • Full Member
  • ***
  • Posts: 247
    • View Profile
Dynamic arrays & fixed arrays
« on: October 28, 2018, 10:07:49 PM »
In my proposal about "real arrays" I gave the following proposal:

Code: [Select]
type struct
{
   pointer_size length;
   some_type *start;
}

It might be better putting the pointer first, since that way we can convert to the pointer with a simple cast:

Code: [Select]
type struct
{
   some_type *start;
   pointer_size length;
}

I wrote that "All arrays are always passed by reference", meaning that the struct is actually passed by value, but since it is a fat pointer, it actually becomes pass by ref. So this makes sense for fixed arrays, but for dynamic arrays that breaks down.

Our dynamic array should look like this as a struct:

Code: [Select]
type struct
{
   some_type *start;
   pointer_size length;
   pointer_size allocated;
   Allocator *allocator;
}


But for this we need to pass the whole array by value!

Unfortunately, we can't then make slices in the same manner as with the fixed vector, a slice to a dynamic vector needs to look different from a slice to a normal vector. This is rather vexing since we would have preferred to let them convert into each other. Since the first two fields are structurally the same for the fixed array and the dynamic one, they could transform into each other.

This is safe:

Code: [Select]
func void foo(int[*] y) { ... };
func void bar(int[] x) { ... };

int[*] a = { 1, 2, 3 };
foo(a);
int[] b = cast<int[]>(a);
bar(a); // converts to foo(cast<struct FixedArray>(a))
printf("%d", b[0]);


These are unsafe:
Code: [Select]
func void foo(int[*] y); { ... }
func void bar(int[] x) { ... };

int[*] a = { 1, 2, 3 };
int[] b = cast<int[]>(a);
int[] c = a[0:1];
foo(a);
printf("%d", b[0]); // The pointer in b might have been freed.
printf("%d", c[0]); // The pointer in c might have been freed.

Either we simply accept these weaknesses... we see the errors as similar to the exceptions that occur in say, Java when getting a view of a map or a list that's later updated. OR we try to be clever. I'm somewhat for the less clever solution that might lead to errors when used incorrectly.

lerno

  • Full Member
  • ***
  • Posts: 247
    • View Profile
Re: Dynamic arrays & fixed arrays
« Reply #1 on: November 04, 2018, 02:07:20 PM »
I've been banging my head against this problem now for quite a while.

Basically, if we have a dynamic array, resize it, then all the pointers to it will break. (Obviously). However, this might not be immediately obvious and lead to problems.

Two main possibilities:

  • A dynamic array is a pointer to a struct with a value and a pointer, e.g. i32[...] is struct { i32 size; i32 allocated, i32* arr; }*. It's unimportant that the underlying arr changes, since it always accessed using indirection. This adds a load, but the result can be cached. Copies of the array are aliases of the underlying data.
  • A dynamic array is a struct (same as above but without the pointer). It is mutable, but any copy becomes a shallow copy of the original.

(1) here our struct works as a pointer all of the time. Nothing weird. We do have the double indirection to get to a value in our array, but since dynamic arrays are likely to have mandatory bounds checking, it's not like anyone will use it for heavily optimized situations anyway.
(2) here someone using the array needs to be careful not to incur accidental copies. A way to avoid that would be to say that copies need to be explicitly done, so int[...] y = x is not allowed, only int[...] y = copy(x) is ok.

Furthermore we need to consider lifetimes. It would be nice to allocate the array on the stack, but this causes some subtle issues with setjmp, as we cannot reclaim the underlying pointer when the stack is dropped.

We're clearly in need of some way to safely dealloc the array, but using the stack probably isn't it.


bas

  • Full Member
  • ***
  • Posts: 220
    • View Profile
Re: Dynamic arrays & fixed arrays
« Reply #2 on: November 08, 2018, 12:01:46 PM »
I think the same here goes for 'smart' strings, etc. It's always possible to create something like this as a developer.
That way it's just a library thing and not a language thing. Unless there are some real benefits by putting this in
the language itself.

lerno

  • Full Member
  • ***
  • Posts: 247
    • View Profile
Re: Dynamic arrays & fixed arrays
« Reply #3 on: November 08, 2018, 01:32:07 PM »
I think there is a very real advantage in making it a first class object. My proposal is obviously not well fleshed out yet, but it's worth looking at Cyclone and how it added a "fat pointer", which basically is the pointer + the size of how much was allocated in it. That's a very generic way of handling everything that might make sense (after giving it some remodelling)

In Cyclone another pointer type is added: "@". So Foo@ would be a fat pointer to Foo (or an array of Foo), and Foo* would be the raw pointer.

I think there is something to consider here. Still, as I say I do not have a full proposal, just noticing that it would improve a lot of the ergonomics if there was more built in support to handle arrays and pointers. Not just for security reasons.

bas

  • Full Member
  • ***
  • Posts: 220
    • View Profile
Re: Dynamic arrays & fixed arrays
« Reply #4 on: November 12, 2018, 10:13:01 AM »
There are so many features pending alread, I find it hard to decide on every one. Ideally for a feature, it's optional and can be added to a language at a later stage without
breaking user code. For defer() this seems to be the case. If that's also possible for this case, we could add it later, or just do some proto-typing (without implementation) first.

lerno

  • Full Member
  • ***
  • Posts: 247
    • View Profile
Re: Dynamic arrays & fixed arrays
« Reply #5 on: November 13, 2018, 03:23:30 AM »
Absolutely, I raise these questions early so that there's plenty of time to get a decent proposal and then figure out if it's useful for the language.

And that order is important too:

1. Try to see how a proposal ideally would look like.
2. Using that final proposal – figure out if it's good for C2.

I think something that happened with some proposals for Zig that I made, was basically people threw things out judging from a very rough proposal. And I think that's a bad process of making improvements, because in the process of refining a proposal of a feature X, there might be better ideas emerging that uses some of concepts in X.

I mean, I read up on improvements in Rust, Go, Zig, Jai, Nim, Kit etc. Obviously ideas in there can't be imported into C2 as they are. So it would be a direct "no" for most of the features in those languages. But if you start looking there are parts that could be changed into something useful, or a subset that is simple and direct enough to fit C2.

So anyway, "how would dynamic arrays look like for C2?" and how would fixed arrays look? What sort of compatibility with naked arrays would we get etc. Usually digging into these details will make a lot of ideas go away by themselves :D

bas

  • Full Member
  • ***
  • Posts: 220
    • View Profile
Re: Dynamic arrays & fixed arrays
« Reply #6 on: November 13, 2018, 09:00:02 AM »
When we put a feature like this in the language, there should be syntactic sugar that is the reward. Otherwise
it doesn't make sense. For example, we could do '+=' on the array to add items, etc. That way the calling code
is easy to read.

lerno

  • Full Member
  • ***
  • Posts: 247
    • View Profile
Re: Dynamic arrays & fixed arrays
« Reply #7 on: November 13, 2018, 10:36:00 AM »
The main motivation for fixed arrays is safety and convenience though.