0` non whitespace character randomly in a single file of the codebase and see how it compiles and runs the tests, then output a entropic fragility measure as a function of `n`, like `CER(2) = [10,68]` which would mean that for `n=2` the compilation fails `10%` of the time while the execution fails `68%` of the time. SBS told me that idea exists and is called `fuzzing`, I will look into that in later posts, here or on my [nascent blog](https://ianxek.github.io/), as I will deem appropriate (I like “deem”, it sounds like I’m an authority). ## SaCzilla’s DNA So how does SaC manage such flabbergasting performance gains compared to state of the art imperative language compiler like gcc ? * Is FP there ? * Side-effects ? * Term rewriting ? * If 3., does it benefit from FP ? * Why is a state of the art compiler for imperative language as STANDARD as C incapable of not looking like it’s not even trying ? So the inner loop is : ```c for (i = 0; i < N; i++ ) { next = first + second%2ll + rep; first = second; second = next - rep; } ``` And SBS explains : > the compiler figures out that `second = second%2ll` This is achieved by leveraging algebraic rules such as associative law, `a+0 = a` and `x - x = 0`. > Once the compiler finds that, the inner loop becomes invariant to the outer one. So there is powerful use of term rewriting, something I personally discovered for the first time (outside dedicated math platforms of course) in the [Pure programming language](https://agraef.github.io/pure-lang/). SBS adds : > Now FP is hitting and the compiler lifts that loop out of the outer _et voilà!_ Isn’t it nice to see when French is cool to substitute for English ? _Eh oui, je suis français._ It turns out that SaC allows us to see the optimised code, in a dialect of SaC that is actually valid SaC ! Use `sac2c -bopt` Let's see the gist of it. Line `506` is the `fibonacci` function in the SaC dialect : ```c /**************************************************************************** * _MAIN::fibonacci(...) [ body ] ****************************************************************************/ longlong _MAIN::fibonacci( longlong f { ,NN } , longlong s { ,NN } , longlong rep { ,NN } , int N { ,NN } ) /* * fibonacci :: --- */ { bool _pinl_136__flat_210 { , NN } ; longlong first__SSA0_1 { , NN } ; _pinl_136__flat_210 = _gt_SxS_( rep, 0LL); first__SSA0_1 = _MAIN::fibonacci__Cond_3( f, _pinl_136__flat_210, N, rep, s, f) ; return( first__SSA0_1); } ``` Here we see the fibonacci function, the optimised version : the compiler, apart from term rewriting, turned the loops into functions, something which another language, namely ATS [encourages to do right in the code itself ](http://ats-lang.sourceforge.net/DOCUMENT/ATS2FUNCRASH/HTML/HTMLTOC/c111.html) : > In imperative programming, a function like testfact is normally implemented in terms of a for-loop > (instead of being defined recursively). While there is indeed direct support for for-loops and > while-loops in ATS, I will not attempt to make use of the support in this book. I hope to make a > convincing argument that making extensive use of recursion is a key to increasing one's > programming producivity. In fact, I think that a functional programmer should develop a reflex for > solving problems recursively. and which we also see in one of the [recommended implementations of the strict Fibonacci in SaC](https://github.com/SacBase/basic-demos/blob/master/simple/fibonacci.sac). So the optimised `fibonacci` actually kept all the variables' names we used. Now that functions declares stuff, checks that `rep` is greater than `0` with `_gt_SxS_` and then passes that check to the function `fibonacci__Cond_3`. So in effect, `fibonacci` is just the `fibonacci__Cond_3`. Let's see what this one does. ```c /**************************************************************************** * Cond function: * _MAIN::fibonacci__Cond_3(...) [ body ] /****************************************************************************/ longlong _MAIN::fibonacci__Cond_3( longlong first { ,NN } , bool _flat_0 { ,NN } , int N { ,NN } , longlong rep { ,NN } , longlong s { ,NN } , longlong f { ,NN } ) /* * fibonacci__Cond_3 :: --- */ { longlong _dlirmov_2401__pinl_137__flat_60 { , NN } ; longlong _dlirmov_2400__pinl_137__flat_60 { , NN } ; longlong _dlirmov_2397_rep { , NN } ; bool _dlirmov_2388__pinl_138__flat_196 { , NN } ; int{0} _dlirmov_2385_i { , NN } ; longlong _dlirmov_2382__pinl_137__flat_60 { , NN } ; bool _dlirmov_2381__pinl_138__flat_196 { , NN } ; int _dlirmov_2380__esd_173 { , NN } ; int{0} _dlirmov_2379_i { , NN } ; longlong first__SSA0_2 { , NN } ; longlong first__SSA0_1 { , NN } ; if (_flat_0) { _dlirmov_2379_i = 0; _dlirmov_2380__esd_173 = _neg_S_( N); _dlirmov_2381__pinl_138__flat_196 = _lt_SxS_( _dlirmov_2380__esd_173, 0); _dlirmov_2397_rep, _dlirmov_2388__pinl_138__flat_196, _dlirmov_2385_i, _dlirmov_2401__pinl_137__flat_60 = _MAIN::_dup_74_fibonacci__Loop_2( _dlirmov_2381__pinl_138__flat_196, _dlirmov_2379_i, rep, s) ; _dlirmov_2382__pinl_137__flat_60 = _add_SxS_( _dlirmov_2397_rep, s); first__SSA0_1 = _MAIN::_dup_2393_fibonacci__Cond_1( f, _dlirmov_2382__pinl_137__flat_60, _dlirmov_2385_i, _dlirmov_2388__pinl_138__flat_196, N) ; } else { } first__SSA0_2 = ( _flat_0 ? first__SSA0_1 : first ); return( first__SSA0_2); } ``` First, we see the check passed as `_flat_0`, then that one also declares a bunch of stuff, then if the check is ok, it calls two functions, `_dup_74_fibonacci__Loop_2` and `_dup_2393_fibonacci__Cond_1`. In between it adds `rep` to `s` in variable number `2382` and passes that to the second function as the second argument. Let's see the first function first. ```c /**************************************************************************** * Loop function with Loop Count -1: * _MAIN::_dup_74_fibonacci__Loop_2(...) [ body ] ****************************************************************************/ longlong, bool, int{0}, longlong _MAIN::_dup_74_fibonacci__Loop_2( bool _pinl_138__flat_196 { ,NN } , int{0} i { ,NN } , longlong rep { ,NN } , longlong s { ,NN } ) /* * _dup_74_fibonacci__Loop_2 :: --- */ { longlong _dlirmov_2399_rep { , NN } ; longlong _dlirmov_2398_rep { , NN } ; bool _dlirmov_2390__pinl_138__flat_196 { , NN } ; bool _dlirmov_2389__pinl_138__flat_196 { , NN } ; int{0} _dlirmov_2387_i { , NN } ; int{0} _dlirmov_2386_i { , NN } ; longlong _dlirmov_2384__pinl_137__flat_60 { , NN } ; longlong _dlirmov_2383__pinl_137__flat_60 { , NN } ; longlong _pinl_137__flat_60 { , NN } ; longlong _pinl_147__flat_152 { , NN } ; bool _pinl_148__flat_210 { , NN } ; _pinl_137__flat_60 = _add_SxS_( rep, s); _pinl_147__flat_152 = _add_SxS_( -1LL, rep); _pinl_148__flat_210 = _gt_SxS_( _pinl_147__flat_152, 0LL); if (_pinl_148__flat_210) { _dlirmov_2398_rep, _dlirmov_2389__pinl_138__flat_196, _dlirmov_2386_i, _dlirmov_2383__pinl_137__flat_60 = _MAIN::_dup_74_fibonacci__Loop_2( _pinl_138__flat_196, i, _pinl_147__flat_152, s) ; } else { } _dlirmov_2399_rep = ( _pinl_148__flat_210 ? _dlirmov_2398_rep : rep ); _dlirmov_2390__pinl_138__flat_196 = ( _pinl_148__flat_210 ? _dlirmov_2389__pinl_138__flat_196 : _pinl_138__flat_196 ); _dlirmov_2387_i = ( _pinl_148__flat_210 ? _dlirmov_2386_i : i ); _dlirmov_2384__pinl_137__flat_60 = ( _pinl_148__flat_210 ? _dlirmov_2383__pinl_137__flat_60 : _pinl_137__flat_60 ); return( _dlirmov_2399_rep, _dlirmov_2390__pinl_138__flat_196, _dlirmov_2387_i, _dlirmov_2384__pinl_137__flat_60); } ``` So here is the computation we exhibited earlier, ie. `_add_SxS_` and `_gt_SxS_` are used to compute `s + rep` and checks if `rep - 1 > 0`, if so it runs `_dup_74_fibonacci__Loop_2` again with a decremented `rep`. Now the second function `_dup_2393_fibonacci__Cond_1` is : ```c /**************************************************************************** * Cond function: * _MAIN::_dup_2393_fibonacci__Cond_1(...) [ body ] ****************************************************************************/ longlong _MAIN::_dup_2393_fibonacci__Cond_1( longlong first { ,NN } , longlong second { ,NN } , int{0} _dlirmov_2385_i { ,NN } , bool _flat_2 { ,NN } , int N { ,NN } ) /* * _dup_2393_fibonacci__Cond_1 :: --- */ { longlong first__SSA0_2 { , NN } ; longlong first__SSA0_1 { , NN } ; if (_flat_2) { first__SSA0_1 = _MAIN::_dup_2392_fibonacci__Loop_0( first, second, _dlirmov_2385_i, N) ; } else { } first__SSA0_2 = ( _flat_2 ? first__SSA0_1 : first ); return( first__SSA0_2); } ``` so we see it calls `_dup_2392_fibonacci__Loop_0` which is : ```c /**************************************************************************** * Loop function with Loop Count -1: * _MAIN::_dup_2392_fibonacci__Loop_0(...) [ body ] ****************************************************************************/ longlong _MAIN::_dup_2392_fibonacci__Loop_0( longlong first { ,NN } , longlong second { ,NN } , int _dlirmov_2385_i { ,NN } , int N { ,NN } ) /* * _dup_2392_fibonacci__Loop_0 :: --- */ { longlong _dl_185 { , NN } ; int _ctz_170 { , NN } ; longlong _pinl_139__flat_128 { , NN } ; int _pinl_144__flat_142 { , NN } ; bool _pinl_145__flat_196 { , NN } ; longlong first__SSA0_3 { , NN } ; longlong first__SSA0_2 { , NN } ; _pinl_139__flat_128 = _mod_SxS_( second, 2LL); _dl_185 = _add_SxS_( _pinl_139__flat_128, first); _pinl_144__flat_142 = _add_SxS_( 1, _dlirmov_2385_i); _ctz_170 = _sub_SxS_( _pinl_144__flat_142, N); _pinl_145__flat_196 = _lt_SxS_( _ctz_170, 0); if (_pinl_145__flat_196) { first__SSA0_2 = _MAIN::_dup_2392_fibonacci__Loop_0( second, _dl_185, _pinl_144__flat_142, N) ; } else { } first__SSA0_3 = ( _pinl_145__flat_196 ? first__SSA0_2 : second ); return( first__SSA0_3); } ``` This is where we see the term rewriting : the `_mod_SxS_`, `_add_SxS_` construct `first + second % 2` and this value is passed to that same function (recursion again) as the parameter named `second`, so we basically construct `second = first + second % 2` and do that `N` times because the `_sub_SxS_` followed by the `_lt_SxS_` checks for `i - N < 0`, so this is basically the inner for loop (`N` times). In the end, we have two loops which instead of one _inside_ the other, are one _after_ the other, so we compute rep + N times, instead of `rep * N` times : it's a logarithmic time reduction effect ! So we go from `O(t)` to `O(log(t))` Reasoning with first order approximation, this is a gain of approximately a factor of `rep` : indeed, since `rep << N` we have `(rep * N)/(rep + N) ~= rep`. This means that the C code compiled with `gcc` should see a running time proportional to `rep` and the SaC code compiled with `sac2c` should take a running time very much independent of rep, because it is the computation controlled by `N` that takes the largest time. Indeed, if we multiply `rep` by 2, we find : ``` gcc -O3 real 0m37.953s user 0m36.650s sys 0m0.218s ``` ``` sac2c -V2 real 0m0.061s user 0m0.027s sys 0m0.031s ``` A bombasting confirmation of my prediction. So let us summarise the key aspects of SaC optimizations : 1. SaC has lifted the inner loop out of the outside one, in effect producing two independent loops ; 2. this lifting arises from the term rewriting that has allowed the compiler to prove that both loops were in fact independent. So the SaC compiler sees that the first `while` computes the useful value of `second` that is to be injected into the `for` loop in order to compute the final value of `second`. ## Term rewriting, FP and lifting So how does SaC do the rewriting ? Let us repeat, because “repetition is the heart of pedagogy” : it uses the axiomatic categorical monadic (whatever that means) facts that `x + 0 = x` and `x-x=0` which applied to the result of the associative rule `(x + y) - y = x + (y - y) = x + 0` gives `x` thus, given y is `rep`, proving that the final value of `second` is independent of `rep`. But once this has been reached, the compiler uses the guaranteed absence of side-effects to effectively lift the inner loop, and basically apply a `log` to the running time ! About “axiomatic categorical monadic” did you know that the demi-god of mathematical physics, Vladimir Arnold -- who came from Russia to work in France -- taught the elite French students and complained that the typical one, when thinking `x * y = y * x` comments that “ah yes, multiplication is _commutative_”. Read Arnold, be enlightened. He’s the epitome of abstraction rooted in humus, or rather melting permafrost. Why doesn't the standard C compiler do the same thing ? > Probably because those don't like algebraic simplifications, SBS told me. True, we are not using fields there, so there are some edge effects that can kick back. But that could also be taken into account by the compiler, making sure that the term rewriting is allowed ; SBS again : > C compilers typically refrain from applying associative law, as, strictly speaking, this does not > hold on any of the numerical types the hardware supports due to possible overflow/ underflow > situations… Whether a C compiler is able to lift out this kind of rather simple loop? Not sure, I would > hope that it would detect the absence of side effects in this case…. But given the FP nature of SaC, it doesn't need to _look_ for the absence of side-effects, because... there are none ! More precisely, but we’ll get to that in later posts, SaC clearly singles out the side-effective parts, so for SaC side-effects are easy to find exceptions, to be marked as such so as to warn the optimizer, and not the normal situations, like in C, where you have to look for the _absence_ of side-effects, or equivalently for the _absence_ of the word "monad" in an typical FP conversation. By the way, why didn't I see that word in SaC nor for that matter in ATS, both FP languages ? :thinking: ## After sac2c optimised code is then fed to gcc But let's not forget that once this optimization is done, it feeds the rewritten code to `gcc` itself, which can now apply its own optimizations. We can indeed see that when using `sac2c` it produces a `.c` file which is quite impressive : it took our 30 LOC original program and output a... `1792` LOC program ! But of course there is a quasi fixed cost, i.e. a quasi fixed overhead : for me, it's like the difference between starting at the gunshot with your legs vs. someone starting a bit later with a dragster, i.e. you need space for the dragster to show its strength, and that is what `rep` gave to SaC. ## Conclusion Try SaC, build the [community](https://www.sac-home.org/slack), get rid of Matlab and stop eyeing its juiced girlfriend Julia. Also read the next posts. ## Exercise > Estimate how many tons of CO2 was produced in the world while you were reading, or more realistically glazing over, that post.