We all love it when our favorite language gets new features with each release, especially those that make our coding experience easier and more productive. And as we know, adding these features to a programming language is not easy: it requires touching the compiler and doing some crazy smart things. It is hard.
But what if we could add new stuff by leveraging the existing compiler magic, without actually touching it? Tada, that’s what Code Lowering is.
So let’s start with a simple definition.
What is Code Lowering?
Code lowering is the act of taking high-level constructs such as foreach loops and lambda expressions and transforming them into lower-level statements such as gotos and conditions.
When you search for code lowering you will often come across another term, “Syntactic Sugar”. The two are closely related: syntactic sugar is the convenient syntax itself, and lowering (sometimes called desugaring) is the compiler step that rewrites that sugar into simpler constructs.
Throughout this article we will use C# as the main reference for examples, because the Roslyn compiler uses lowering extensively. Other languages use code lowering too, so you can check their documentation for similar examples.
In addition, we will be using SharpLab.io to see the generated C# code after lowering. It is a great tool and you should definitely give it a try.
So let’s dive deep and discover what is going on under the covers.
How does Code Lowering work?
The oldest example that I know of involves the yield keyword. Check out this code:
public static IEnumerable<int> GetYield()
{
    yield return 1;
}
Four lines of code, very simple, isn’t it? Now let’s remove the makeup: paste the code into SharpLab.io (SharpLab – Example 1) and look at the lowering result:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Security;
using System.Security.Permissions;

[assembly: CompilationRelaxations(8)]
[assembly: RuntimeCompatibility(WrapNonExceptionThrows = true)]
[assembly: Debuggable(DebuggableAttribute.DebuggingModes.Default | DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints | DebuggableAttribute.DebuggingModes.EnableEditAndContinue | DebuggableAttribute.DebuggingModes.DisableOptimizations)]
[assembly: SecurityPermission(SecurityAction.RequestMinimum, SkipVerification = true)]
[assembly: AssemblyVersion("0.0.0.0")]
[module: UnverifiableCode]
public static class C
{
    [CompilerGenerated]
    private sealed class <GetYield>d__0 : IEnumerable<int>, IEnumerable, IEnumerator<int>, IEnumerator, IDisposable
    {
        private int <>1__state;
        private int <>2__current;
        private int <>l__initialThreadId;

        int IEnumerator<int>.Current
        {
            [DebuggerHidden]
            get
            {
                return <>2__current;
            }
        }

        object IEnumerator.Current
        {
            [DebuggerHidden]
            get
            {
                return <>2__current;
            }
        }

        [DebuggerHidden]
        public <GetYield>d__0(int <>1__state)
        {
            this.<>1__state = <>1__state;
            <>l__initialThreadId = Environment.CurrentManagedThreadId;
        }

        [DebuggerHidden]
        void IDisposable.Dispose()
        {
        }

        private bool MoveNext()
        {
            int num = <>1__state;
            if (num != 0)
            {
                if (num != 1)
                {
                    return false;
                }
                <>1__state = -1;
                return false;
            }
            <>1__state = -1;
            <>2__current = 1;
            <>1__state = 1;
            return true;
        }

        bool IEnumerator.MoveNext()
        {
            //ILSpy generated this explicit interface implementation from .override directive in MoveNext
            return this.MoveNext();
        }

        [DebuggerHidden]
        void IEnumerator.Reset()
        {
            throw new NotSupportedException();
        }

        [DebuggerHidden]
        IEnumerator<int> IEnumerable<int>.GetEnumerator()
        {
            if (<>1__state == -2 && <>l__initialThreadId == Environment.CurrentManagedThreadId)
            {
                <>1__state = 0;
                return this;
            }
            return new <GetYield>d__0(0);
        }

        [DebuggerHidden]
        IEnumerator IEnumerable.GetEnumerator()
        {
            return ((IEnumerable<int>)this).GetEnumerator();
        }
    }

    [IteratorStateMachine(typeof(<GetYield>d__0))]
    public static IEnumerable<int> GetYield()
    {
        return new <GetYield>d__0(-2);
    }
}
That’s a lot of code, and all of it is auto-generated for us just to hide the magic of the yield keyword. So let’s see what happened.
As we know, the yield keyword is used to return elements from a function one at a time, which means we are dealing with an iterator. That’s why we mark the return type of the function as IEnumerable<>: the thing we return from the function is an object that implements the iterator pattern.
Looking at the code, we can see that the compiler has generated a class named <GetYield>d__0 that implements IEnumerable<int>, IEnumerable, IEnumerator<int>, IEnumerator and IDisposable.
Inside the class we can see the actual implementation of the iterator pattern: the <>1__state field records where the iterator currently is, <>2__current holds the value to hand back, and MoveNext() moves from one state to the next. It is actually very straightforward.
One thing I want you to notice is the naming: the class and the fields generated by the compiler all contain angle brackets (<GetYield>d__0, <>1__state). Those characters are not allowed in identifiers you write yourself, so generated names can never clash with your own code. You will find this naming convention in all compiler-generated code.
This example gives us a clear idea of what lowering is. The yield keyword was introduced as a language feature, but what the rest of the compiler actually deals with is plain, primitive C# code.
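To make the state machine easier to follow, here is a hand-written and simplified sketch of the same idea for a method that yields 1 and then 2. The names HandWrittenIterator and IteratorSketch are invented for illustration, and the thread check and enumerable reuse seen in the real generated class are left out on purpose.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written approximation of what the compiler generates for:
//     yield return 1;
//     yield return 2;
// All names are hypothetical; real generated names contain '<' and '>'.
public sealed class HandWrittenIterator : IEnumerator<int>
{
    // 0 = not started, 1 = after first yield, 2 = after second yield, -1 = finished
    private int _state;
    private int _current;

    public int Current => _current;
    object IEnumerator.Current => _current;

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:
                _current = 1;   // first "yield return"
                _state = 1;
                return true;
            case 1:
                _current = 2;   // second "yield return"
                _state = 2;
                return true;
            default:
                _state = -1;    // nothing left to yield
                return false;
        }
    }

    public void Reset() => throw new NotSupportedException();

    public void Dispose() { }
}

public static class IteratorSketch
{
    // Drives the state machine by hand and collects everything it yields.
    public static List<int> Collect()
    {
        var results = new List<int>();
        using (var iterator = new HandWrittenIterator())
        {
            while (iterator.MoveNext())
            {
                results.Add(iterator.Current);
            }
        }
        return results;
    }
}
```

Calling IteratorSketch.Collect() walks the states in order, which is the same bookkeeping the generated MoveNext() does with <>1__state.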
Another example of lowering is the using statement with IDisposable objects:
using (var fileStream = new FileStream("c:\\test.txt", FileMode.Open))
{
    // code ...
}
Lowering result:
using System;
using System.Diagnostics;
using System.IO;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Security;
using System.Security.Permissions;

[assembly: CompilationRelaxations(8)]
[assembly: RuntimeCompatibility(WrapNonExceptionThrows = true)]
[assembly: Debuggable(DebuggableAttribute.DebuggingModes.Default | DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints | DebuggableAttribute.DebuggingModes.EnableEditAndContinue | DebuggableAttribute.DebuggingModes.DisableOptimizations)]
[assembly: SecurityPermission(SecurityAction.RequestMinimum, SkipVerification = true)]
[assembly: AssemblyVersion("0.0.0.0")]
[module: UnverifiableCode]
public class C
{
    public void M()
    {
        FileStream fileStream = new FileStream("c:\\test.txt", FileMode.Open);
        try
        {
        }
        finally
        {
            if (fileStream != null)
            {
                ((IDisposable)fileStream).Dispose();
            }
        }
    }
}
What happened here is that the using statement has been converted into a try/finally block, and Dispose() is called in the finally block. So instead of writing all of this code ourselves, we just write a using statement.
Our lovely foreach statement goes through lowering too:
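As a small sketch of that equivalence, the two methods below do the same work, one with the sugar and one with the try/finally written out by hand. TracedResource and UsingSketch are hypothetical names made up for this illustration; the resource only records what happens to it so the order of events is visible.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable resource that records what happens to it.
public sealed class TracedResource : IDisposable
{
    private readonly List<string> _log;

    public TracedResource(List<string> log)
    {
        _log = log;
        _log.Add("created");
    }

    public void Dispose() => _log.Add("disposed");
}

public static class UsingSketch
{
    // The sugared form.
    public static List<string> WithUsing()
    {
        var log = new List<string>();
        using (var resource = new TracedResource(log))
        {
            log.Add("body");
        }
        return log;
    }

    // Roughly what the compiler turns it into.
    public static List<string> WithTryFinally()
    {
        var log = new List<string>();
        TracedResource resource = new TracedResource(log);
        try
        {
            log.Add("body");
        }
        finally
        {
            // Dispose runs even if the body throws.
            if (resource != null)
            {
                ((IDisposable)resource).Dispose();
            }
        }
        return log;
    }
}
```

Both methods should record the same sequence: the resource is created, the body runs, and the resource is disposed.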
var items = new List<int>() {1, 2, 3};
foreach (var item in items)
{
    Console.WriteLine(item);
}
Lowering result:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Security;
using System.Security.Permissions;

[assembly: CompilationRelaxations(8)]
[assembly: RuntimeCompatibility(WrapNonExceptionThrows = true)]
[assembly: Debuggable(DebuggableAttribute.DebuggingModes.Default | DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints | DebuggableAttribute.DebuggingModes.EnableEditAndContinue | DebuggableAttribute.DebuggingModes.DisableOptimizations)]
[assembly: SecurityPermission(SecurityAction.RequestMinimum, SkipVerification = true)]
[assembly: AssemblyVersion("0.0.0.0")]
[module: UnverifiableCode]
public class C
{
    public void M()
    {
        List<int> list = new List<int>();
        list.Add(1);
        list.Add(2);
        list.Add(3);
        List<int>.Enumerator enumerator = list.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int current = enumerator.Current;
                Console.WriteLine(current);
            }
        }
        finally
        {
            ((IDisposable)enumerator).Dispose();
        }
    }
}
It is lowered into a try/finally block with a while loop that drives the enumerator through MoveNext() and Current, which is the iterator pattern again.
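If you want to convince yourself that the two forms behave the same, a minimal sketch like the following can help. ForEachSketch and its method names are made up for this example, and the hand-written version calls Dispose() directly on the enumerator instead of going through the IDisposable cast shown in the decompiled output.

```csharp
using System;
using System.Collections.Generic;

public static class ForEachSketch
{
    // The sugared form: sums the items with foreach.
    public static int SumWithForEach(List<int> items)
    {
        int sum = 0;
        foreach (var item in items)
        {
            sum += item;
        }
        return sum;
    }

    // Roughly the lowered form: the same loop written against the enumerator.
    public static int SumWithEnumerator(List<int> items)
    {
        int sum = 0;
        List<int>.Enumerator enumerator = items.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int current = enumerator.Current;
                sum += current;
            }
        }
        finally
        {
            enumerator.Dispose();
        }
        return sum;
    }
}
```

Passing the same list to both methods should give the same sum, since the second is just the first with the sugar removed.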
Another example is the record type introduced in C# 9. You may think it is a new kind of type like struct and class, but it is not: a record is just a class with compiler-generated members around it (C# 10 later added record struct, which is lowered to a struct in the same spirit). Try putting an example into SharpLab and look at the lowered code.
And there is even more; as I said, the Roslyn compiler uses this technique heavily. If you take a look at the source code of the compiler under ‘roslyn/src/Compilers/CSharp/Portable/Lowering/’ on GitHub, you will find the following folders:
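To give a rough feel for the members a record brings along, here is a hand-written and heavily simplified approximation of a two-property positional record. PointSketch and WithX are hypothetical names; the real generated class differs in details (init-only properties, a clone method behind with expressions, an EqualityContract property), so treat this as a sketch rather than the exact lowering.

```csharp
using System;

// The sugared form would be a one-liner along the lines of:
//     public record Point(int X, int Y);
// Below is a simplified, hand-written stand-in for what that expands to.
public sealed class PointSketch : IEquatable<PointSketch>
{
    public int X { get; }
    public int Y { get; }

    public PointSketch(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Value-based equality: two instances are equal when their data is equal.
    public bool Equals(PointSketch other) =>
        !ReferenceEquals(other, null) && X == other.X && Y == other.Y;

    public override bool Equals(object obj) => Equals(obj as PointSketch);

    public override int GetHashCode() => unchecked((X * 397) ^ Y);

    public static bool operator ==(PointSketch left, PointSketch right) =>
        ReferenceEquals(left, right) || (!ReferenceEquals(left, null) && left.Equals(right));

    public static bool operator !=(PointSketch left, PointSketch right) => !(left == right);

    // Readable ToString, similar in shape to what records print.
    public override string ToString() => $"PointSketch {{ X = {X}, Y = {Y} }}";

    // Enables: var (x, y) = point;
    public void Deconstruct(out int x, out int y)
    {
        x = X;
        y = Y;
    }

    // Simplified stand-in for a "with" expression that changes X.
    public PointSketch WithX(int x) => new PointSketch(x, Y);
}
```

All of this is ordinary class code, which is the point: the record keyword saves you from writing it by hand.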
- AsyncRewriter: this folder contains the lowering logic for async/await.
- ClosureConversion: this one contains the lowering logic for lambda expressions.
- IteratorRewriter: this one is related to the iterator pattern, as we have seen with the yield keyword.
- LocalRewriter: this folder contains the lowering logic for multiple things:
  - Delegate creation
  - Events
  - ForEach loop
  - ‘is’ operator
  - ‘lock’ statement
  - ‘??’ null-coalescing operator
  - ‘switch’ statement
  - and many, many more
When I first discovered this technique, I was both shocked and amazed. It opens up endless possibilities for enriching the language with new features in a fairly simple way. Great job.
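To pick one item from the list, the lock statement on an ordinary object is documented as being equivalent to a Monitor.Enter / Monitor.Exit pair wrapped in try/finally. The sketch below writes both forms side by side; CounterSketch and its members are hypothetical names used only for this illustration.

```csharp
using System.Threading;

public sealed class CounterSketch
{
    private readonly object _gate = new object();
    private int _count;

    // The sugared form.
    public int IncrementWithLock()
    {
        lock (_gate)
        {
            return ++_count;
        }
    }

    // Roughly what the lock statement is lowered to.
    public int IncrementWithMonitor()
    {
        object gate = _gate;
        bool lockTaken = false;
        try
        {
            Monitor.Enter(gate, ref lockTaken);
            return ++_count;
        }
        finally
        {
            // Release only if the lock was actually acquired.
            if (lockTaken)
            {
                Monitor.Exit(gate);
            }
        }
    }
}
```

Both methods protect the same counter with the same lock object, so they can be mixed freely.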
Is it bad?
OK, we’ve seen how amazing this is. But you may be asking: with all this auto-generated code that I don’t control, how good and performant will it be? Should I be concerned? The answer is no, as long as you know what you are doing and what is going on under the covers.
The generated code is generally well optimized, so in most cases you should not worry. But there are cases where you should be careful, especially with lambda expressions and closures, where the generated code can introduce extra heap allocations.
Here are some references to learn more about the topic:
1- A talk by David Wengier at NDC Conferences where he explains in detail how this works under the covers:
Lowering in C#: What’s really going on in your code? – David Wengier
2- An article about the lambda, LINQ and closure issues on the JetBrains blog:
Unusual Ways of Boosting Up App Performance. Lambdas and LINQs
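To see why closures deserve a second look, here is a hand-written sketch of what a lambda that captures a local variable is roughly turned into. ClosureSketch and AdderClosure are made-up names; the compiler's own generated class has an unspeakable name containing angle brackets, and its exact shape may differ.

```csharp
using System;

public static class ClosureSketch
{
    // The sugared form: the lambda captures the 'amount' parameter.
    public static Func<int, int> MakeAdderWithLambda(int amount)
    {
        return value => value + amount;
    }

    // Hand-written stand-in for the compiler-generated closure class.
    private sealed class AdderClosure
    {
        public int amount;   // the captured variable becomes a field

        public int Invoke(int value) => value + amount;
    }

    // Roughly the lowered form.
    public static Func<int, int> MakeAdderByHand(int amount)
    {
        var closure = new AdderClosure();            // one heap allocation per call
        closure.amount = amount;
        return new Func<int, int>(closure.Invoke);   // plus one delegate allocation
    }
}
```

Each call allocates a closure object and a delegate, which is harmless in most code but can add up when it happens inside a hot loop.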