Topic: LLVM/C gen

lerno

  • Full Member
  • Posts: 247
LLVM/C gen
« on: November 15, 2018, 03:27:49 AM »
Having two backends makes it a bit hard to keep feature parity between them. The LLVM backend is far behind, so what is the strategy?

Using C, a nice thing is that we can start bootstrapping early if we'd like to(!) We can build parts of the server in C2, then compile to C and then automatically copy that code into the main source!

On the other hand, keeping the same behaviour between LLVM and C isn't easy. I've looked at Clang's LLVM gen, and it produces a lot of optimized code by leveraging intrinsics for certain "known" functions. So for example, if Clang sees sqrt, it can swap the normal library version for an LLVM intrinsic. To complicate things further, those are target dependent :( So there is a *massive* amount of work to do on the LLVM gen.
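To make the sqrt example concrete, here is a minimal sketch of such a "known function" lookup. The helper name `intrinsicFor` and the table are made up for illustration and are not taken from Clang or C2; only the intrinsic names (`llvm.sqrt.*`, `llvm.fabs.*`) are real LLVM intrinsics. A real compiler would also have to check whether the swap is allowed for the current target and flags, which this sketch ignores.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper (not Clang's or C2's code): maps a few well-known libm
// names to an LLVM intrinsic that could replace the library call.
// Returns an empty string when the callee should stay a normal call.
std::string intrinsicFor(const std::string& callee) {
    static const std::unordered_map<std::string, std::string> table = {
        {"sqrt",  "llvm.sqrt.f64"},
        {"sqrtf", "llvm.sqrt.f32"},
        {"fabs",  "llvm.fabs.f64"},
        {"fabsf", "llvm.fabs.f32"},
    };
    auto it = table.find(callee);
    if (it == table.end())
        return "";  // unknown function: keep the ordinary library call
    return it->second;
}
```

For example, `intrinsicFor("sqrt")` gives `"llvm.sqrt.f64"`, while `intrinsicFor("printf")` gives an empty string.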

Obviously if we get more people behind the project then that might be an easier thing to do. Without a lot of people spending time on C2 it will have a hard time being anywhere near optimized.

So what would the plan be?

bas

  • Full Member
  • Posts: 220
Re: LLVM/C gen
« Reply #1 on: November 15, 2018, 09:24:21 AM »
The 2 back-ends are indeed worlds apart.

When starting with the back-ends, I needed one that was easy to write/debug to test with. That is the C back-end.
The generated C code is readable and can be easily checked.

The LLVM/IR back-end is only in the initial phase. Generating code for this is more complex than C and also the result
is harder to check. When C2 started, LLVM was at 3.2 (I think). Since then, the API has changed a lot, so the 2nd
goal was to keep the contact layer between C2 and LLVM/Clang to a minimum, to be able to rebase easily. That has
proven to be very nice: most rebases take one to two hours.

In the end, the IR back-end will be the main one and we might even drop the C back-end if it becomes very hard to
map C2 functions to it. It might be good fun to start integrating the IR back-end more and to really generate
an executable with it (for a small subset of the language at the start). C2C currently just generates IR. What is missing
is calling LLVM for optimization passes and then generating binary code, linking, etc.
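As a rough sketch of those missing steps, the same pipeline can be spelled out with the stock LLVM command-line tools instead of the in-process API. The helper `buildCommands` and the file naming are assumptions for illustration; it only builds the command strings and assumes `opt`, `llc` and a system `cc` are on the PATH and that the IR was written to `<name>.ll`.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the shell commands that would turn an IR file
// "<name>.ll" into an executable "<name>". It does not run anything.
std::vector<std::string> buildCommands(const std::string& name) {
    return {
        // 1. run the optimization passes on the IR
        "opt -O2 " + name + ".ll -o " + name + ".bc",
        // 2. generate binary (object) code for the host target
        "llc -filetype=obj " + name + ".bc -o " + name + ".o",
        // 3. link the object file into an executable
        "cc " + name + ".o -o " + name,
    };
}
```

Calling `buildCommands("hello")` yields three commands, the second being `llc -filetype=obj hello.bc -o hello.o`. Doing the same through the LLVM libraries avoids the temporary files, but the order of the steps stays the same.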

Language-wise, the back-end is not important. So fleshing out the language itself is the highest priority. But it might be
fun to start doing something (like generating an executable).

lerno

Re: LLVM/C gen
« Reply #2 on: November 15, 2018, 03:35:09 PM »
I started to work on IR generation, but found it extremely painful to even implement something as simple as a while-loop DESPITE WORKING DIRECTLY FROM THE IMPLEMENTATION IN THE CLANG SOURCE!
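For reference, the block structure a while-loop needs is small once it is written out. The sketch below emits it as textual IR rather than through the LLVM API, so it sidesteps the API problems entirely. The function `emitWhile` is made up for illustration, the block names follow the `while.cond` / `while.body` / `while.end` pattern that Clang's output uses, and nothing is validated.

```cpp
#include <string>

// Hypothetical sketch: emits the textual IR skeleton of `while (cond) body`.
// `condIR` must define the i1 value named by `condValue`; `bodyIR` is the
// already generated loop body. Both are pasted in unchecked.
std::string emitWhile(const std::string& condIR,
                      const std::string& condValue,
                      const std::string& bodyIR) {
    std::string out;
    out += "  br label %while.cond\n";  // terminate the block before the loop
    out += "while.cond:\n";
    out += condIR;                       // evaluate the condition each round
    out += "  br i1 " + condValue +
           ", label %while.body, label %while.end\n";
    out += "while.body:\n";
    out += bodyIR;
    out += "  br label %while.cond\n";  // back edge to the condition
    out += "while.end:\n";               // code after the loop continues here
    return out;
}
```

This only handles a single loop per function, since the label names are fixed; a second loop would need unique names.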

I think it's amazing that people manage to work with LLVM given how extremely poor the documentation is for actually finding what "function X" does. That the LLVM docs and examples are constantly outdated does not help very much.

The best advice I found was someone who wrote "I write some C code and look at what the LLVM IR output is". One web service to do that is http://ellcc.org/demo/index.cgi
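The same thing works locally with Clang's `-S -emit-llvm` flags. A small input like the one below is enough to see how a while-loop comes out; the file name `example.cpp` and the function `sum_below` are just placeholders.

```cpp
// example.cpp: a small input to feed to Clang.
// To see the IR, run:
//   clang++ -S -emit-llvm -O0 example.cpp -o example.ll
// extern "C" keeps the function name unmangled so it is easy to find
// in the generated .ll file.
extern "C" int sum_below(int n) {
    int total = 0;
    int i = 0;
    while (i < n) {  // look for the loop's basic blocks in example.ll
        total += i;
        ++i;
    }
    return total;
}
```

Using `-O0` keeps the output close to what the front-end emits, before the optimizer rewrites it.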

We should also review the LLVM IR output to make sure we use the diagnostics in LLVM.

bas

Re: LLVM/C gen
« Reply #3 on: November 27, 2018, 09:37:23 AM »
Yes, I concur, I ran into the same issue. Since C2 and C are close relatives (and the Clang code served as an example on many
occasions), look at the Clang source code. Usually that's more complex than C2 needs, but it should contain the basics.

lerno

Re: LLVM/C gen
« Reply #4 on: November 27, 2018, 05:24:43 PM »
Should we try to integrate the diagnostics or not?

bas

Re: LLVM/C gen
« Reply #5 on: November 29, 2018, 08:18:50 AM »
What do you mean exactly? Do you have a code reference for that?

lerno

Re: LLVM/C gen
« Reply #6 on: November 29, 2018, 02:38:19 PM »
From Clang:

Code:
void CodeGenFunction::EmitForStmt(const ForStmt &S,
                                  ArrayRef<const Attr *> ForAttrs) {
  JumpDest LoopExit = getJumpDestInCurrentScope("for.end");

  LexicalScope ForScope(*this, S.getSourceRange());

  // Evaluate the first part before the loop.
  if (S.getInit())
    EmitStmt(S.getInit());

  // Start the loop with a block that tests the condition.
  // If there's an increment, the continue scope will be overwritten
  // later.
  JumpDest Continue = getJumpDestInCurrentScope("for.cond");
  llvm::BasicBlock *CondBlock = Continue.getBlock();
  EmitBlock(CondBlock);

  const SourceRange &R = S.getSourceRange();
  LoopStack.push(CondBlock, CGM.getContext(), ForAttrs,
                 SourceLocToDebugLoc(R.getBegin()),
                 SourceLocToDebugLoc(R.getEnd()));

  // If the for loop doesn't have an increment we can just use the
  // condition as the continue block.  Otherwise we'll need to create
  // a block for it (in the current scope, i.e. in the scope of the
  // condition), and that we will become our continue block.
  if (S.getInc())
    Continue = getJumpDestInCurrentScope("for.inc");

  // Store the blocks to use for break and continue.
  BreakContinueStack.push_back(BreakContinue(LoopExit, Continue));

  // Create a cleanup scope for the condition variable cleanups.
  LexicalScope ConditionScope(*this, S.getSourceRange());

  if (S.getCond()) {
    // If the for statement has a condition scope, emit the local variable
    // declaration.
    if (S.getConditionVariable()) {
      EmitAutoVarDecl(*S.getConditionVariable());
    }

    llvm::BasicBlock *ExitBlock = LoopExit.getBlock();
    // If there are any cleanups between here and the loop-exit scope,
    // create a block to stage a loop exit along.
    if (ForScope.requiresCleanups())
      ExitBlock = createBasicBlock("for.cond.cleanup");

    // As long as the condition is true, iterate the loop.
    llvm::BasicBlock *ForBody = createBasicBlock("for.body");

    // C99 6.8.5p2/p4: The first substatement is executed if the expression
    // compares unequal to 0.  The condition must be a scalar type.
    llvm::Value *BoolCondVal = EvaluateExprAsBool(S.getCond());
    Builder.CreateCondBr(
        BoolCondVal, ForBody, ExitBlock,
        createProfileWeightsForLoop(S.getCond(), getProfileCount(S.getBody())));

    if (ExitBlock != LoopExit.getBlock()) {
      EmitBlock(ExitBlock);
      EmitBranchThroughCleanup(LoopExit);
    }

    EmitBlock(ForBody);
  } else {
    // Treat it as a non-zero constant.  Don't even create a new block for the
    // body, just fall into it.
  }
  incrementProfileCounter(&S);

  {
    // Create a separate cleanup scope for the body, in case it is not
    // a compound statement.
    RunCleanupsScope BodyScope(*this);
    EmitStmt(S.getBody());
  }

  // If there is an increment, emit it next.
  if (S.getInc()) {
    EmitBlock(Continue.getBlock());
    EmitStmt(S.getInc());
  }

  BreakContinueStack.pop_back();

  ConditionScope.ForceCleanup();

  EmitStopPoint(&S);
  EmitBranch(CondBlock);

  ForScope.ForceCleanup();

  LoopStack.pop();

  // Emit the fall-through block.
  EmitBlock(LoopExit.getBlock(), true);
}

Here, "EmitStopPoint" is the part that emits debug information. It's in more places as well.

Code:
void CodeGenFunction::EmitStopPoint(const Stmt *S) {
  if (CGDebugInfo *DI = getDebugInfo()) {
    SourceLocation Loc;
    Loc = S->getLocStart();
    DI->EmitLocation(Builder, Loc);

    LastStopPoint = Loc;
  }
}
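On the IR side, such a stop point ends up as a `!dbg` attachment on the instruction plus a `!DILocation` metadata node. Here is a minimal sketch of that pairing in textual form; the struct `LocatedInst` and the function `withDebugLoc` are made up for illustration, and the metadata numbers are simply supplied by the caller.

```cpp
#include <string>

// Hypothetical sketch: attaches a source location to one textual IR
// instruction. `mdId` is the number of the new metadata node, `scopeId` the
// number of an existing scope node (e.g. a DISubprogram).
struct LocatedInst {
    std::string inst;      // the instruction with its !dbg attachment
    std::string metadata;  // the matching !DILocation node
};

LocatedInst withDebugLoc(const std::string& inst, unsigned line,
                         unsigned column, unsigned mdId, unsigned scopeId) {
    LocatedInst r;
    r.inst = inst + ", !dbg !" + std::to_string(mdId);
    r.metadata = "!" + std::to_string(mdId) + " = !DILocation(line: " +
                 std::to_string(line) + ", column: " + std::to_string(column) +
                 ", scope: !" + std::to_string(scopeId) + ")";
    return r;
}
```

A complete module also needs the compile unit, file and subprogram nodes that the scope refers to, which is where most of the extra work sits.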

bas

Re: LLVM/C gen
« Reply #7 on: November 30, 2018, 11:56:07 AM »
It seems like a good idea, but I haven't really focused on that part yet. What do you think?

lerno

Re: LLVM/C gen
« Reply #8 on: November 30, 2018, 06:32:45 PM »
I don't know. I've looked at IR gen by other languages and it seems they don't have it. It might be that they are not yet heavily invested in debugging. I don't know.

bas

Re: LLVM/C gen
« Reply #9 on: December 02, 2018, 12:22:59 PM »
Can we just start without it and add it when we need it?