Chapter 6 of “Lua Source Code Appreciation” covers content related to Lua coroutines.
Lua Objects
At the C level, a Lua state is represented by a lua_State. Within one Lua VM, multiple lua_States share a single global_State. A lua_State should not be viewed as a simple static data structure, but as the state of a Lua “thread”: it holds that thread’s execution state, data stack, call stack, and so on.
Lua Data Stack
Lua data can be divided into value types and reference types. Both are represented by the union Value in lobject.h.
union Value {
  GCObject *gc;    /* collectable objects */
  void *p;         /* light userdata */
  int b;           /* booleans */
  lua_CFunction f; /* light C functions */
  numfield         /* numbers */
};
As you can see, reference types use a GCObject pointer for indirect reference, while other value types are stored directly in the union. To distinguish the types in the union, an additional type field is also needed.
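To make the role of the extra type field concrete, here is a minimal sketch of a tagged union. The names MiniValue, MiniTValue, and the MT_* tags are hypothetical stand-ins chosen for illustration; they are not Lua's actual definitions.

```c
#include <assert.h>

/* Hypothetical type tags for this sketch (not Lua's real constants). */
enum MiniType { MT_BOOLEAN, MT_LIGHTUSERDATA, MT_NUMBER, MT_OBJECT };

/* Stand-in for union Value: one storage slot, several interpretations. */
typedef union MiniValue {
  void *gc;   /* collectable object (a GCObject * in Lua) */
  void *p;    /* light userdata */
  int b;      /* boolean */
  double n;   /* number */
} MiniValue;

/* The union plus the tag that records which field is currently valid. */
typedef struct MiniTValue {
  MiniValue value_;
  int tt_;
} MiniTValue;

static void set_number(MiniTValue *o, double n) {
  o->value_.n = n;
  o->tt_ = MT_NUMBER;
}

static void set_boolean(MiniTValue *o, int b) {
  o->value_.b = b;
  o->tt_ = MT_BOOLEAN;
}

static int is_number(const MiniTValue *o) {
  return o->tt_ == MT_NUMBER;
}

/* A field may only be read after its tag has been checked. */
static double number_value(const MiniTValue *o) {
  assert(is_number(o));
  return o->value_.n;
}
```

Every write sets the tag together with the payload, so a reader never has to guess which member of the union is live.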
A thread’s data stack is created with BASIC_STACK_SIZE (2 * LUA_MINSTACK) slots, while a C function called from Lua is only guaranteed LUA_MINSTACK free slots. (LUA_MINSTACK defaults to 20.)
So a C function that needs more slots has to grow the stack explicitly; lua_checkstack does this through the luaD_growstack function in ldo.c. Each growth at least doubles the stack size. Because the stack is reallocated, the stored values are simply copied, but every pointer into the old stack (L->top, open upvalues, and the func, top, and base pointers of each CallInfo) has to be fixed by correctstack.
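The grow-then-correct pattern can be sketched in isolation. This is a simplified illustration under stated assumptions, not Lua's code: MiniStack and the ministack_* functions are hypothetical, the slots are plain doubles, and only one pointer (top) has to be rebased.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct MiniStack {
  double *stack;  /* base of the slot array */
  double *top;    /* first free slot; points INTO the array */
  int size;       /* number of allocated slots */
} MiniStack;

static int ministack_init(MiniStack *s, int size) {
  s->stack = malloc(sizeof(double) * (size_t)size);
  if (s->stack == NULL) return 0;
  s->top = s->stack;
  s->size = size;
  return 1;
}

/* Make sure n free slots exist; grow to at least double the size. */
static int ministack_grow(MiniStack *s, int n) {
  int used = (int)(s->top - s->stack);  /* save the offset before realloc */
  int needed = used + n;
  int newsize = 2 * s->size;
  double *newstack;
  if (needed <= s->size) return 1;      /* already enough room */
  if (newsize < needed) newsize = needed;
  newstack = realloc(s->stack, sizeof(double) * (size_t)newsize);
  if (newstack == NULL) return 0;
  /* The block may have moved: rebase every pointer into the stack. */
  s->stack = newstack;
  s->top = newstack + used;
  s->size = newsize;
  return 1;
}

static int ministack_push(MiniStack *s, double v) {
  if (!ministack_grow(s, 1)) return 0;
  *s->top++ = v;
  return 1;
}
```

The sketch saves the offset of top before reallocating and rebuilds the pointer afterwards. The more pointers a runtime keeps into its stack, the more of them a correction pass has to visit.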
Call Stack
Lua’s call stack is made of CallInfo structures, kept as a doubly linked list in the “thread” object.
#define next_ci(L) (L->ci = (L->ci->next ? L->ci->next : luaE_extendCI(L)))
As you can see, in the Lua 5.2 implementation, the call stack is encapsulated as an infinitely extensible stack, with unused linked list nodes only cleaned up during GC.
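A small sketch of the same idea: a doubly linked list of frames that reuses the next node when it exists and allocates one only on demand. MiniCI, MiniThread, and the helper functions are hypothetical names for this illustration. free_unused_ci plays the role the text assigns to the GC.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct MiniCI {
  struct MiniCI *previous, *next;
  int depth;  /* illustrative payload */
} MiniCI;

typedef struct MiniThread {
  MiniCI base_ci;  /* first frame, embedded in the thread */
  MiniCI *ci;      /* current frame */
} MiniThread;

static void thread_init(MiniThread *L) {
  L->base_ci.previous = L->base_ci.next = NULL;
  L->base_ci.depth = 0;
  L->ci = &L->base_ci;
}

/* Allocate a new node and link it after the current frame. */
static MiniCI *extend_ci(MiniThread *L) {
  MiniCI *ci = malloc(sizeof *ci);
  if (ci == NULL) abort();
  ci->previous = L->ci;
  ci->next = NULL;
  ci->depth = L->ci->depth + 1;
  L->ci->next = ci;
  return ci;
}

/* Enter a call: reuse the next node if there is one, else extend. */
static MiniCI *enter_call(MiniThread *L) {
  L->ci = L->ci->next ? L->ci->next : extend_ci(L);
  return L->ci;
}

/* Leave a call: step back but keep the node for later reuse. */
static void leave_call(MiniThread *L) {
  assert(L->ci->previous != NULL);
  L->ci = L->ci->previous;
}

/* Free every node beyond the current frame. */
static void free_unused_ci(MiniThread *L) {
  MiniCI *ci = L->ci->next;
  L->ci->next = NULL;
  while (ci != NULL) {
    MiniCI *next = ci->next;
    free(ci);
    ci = next;
  }
}
```

Returning from a call never frees anything, so a loop that calls and returns repeatedly reuses the same nodes instead of allocating each time.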
Thread Execution and Interruption
Lua, as an embedded language, uses C’s setjmp / longjmp mechanism uniformly for interrupting execution and for error handling, so its coroutines do not depend on any platform-specific mechanism. When compiled as C++, it uses try / catch instead. The two are switched through macros (LUAI_THROW and LUAI_TRY in ldo.c).
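The following is a minimal sketch of a protected call built on setjmp / longjmp, in the spirit of what the paragraph above describes. MiniState, MiniJmp, mini_throw, and mini_rawrunprotected are hypothetical names for this illustration, and the status codes are arbitrary.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* One recovery point; nested protected calls chain through previous. */
typedef struct MiniJmp {
  struct MiniJmp *previous;
  jmp_buf b;
  volatile int status;  /* error code, 0 when no error occurred */
} MiniJmp;

typedef struct MiniState {
  MiniJmp *errorJmp;  /* innermost active recovery point */
} MiniState;

typedef void (*MiniPfunc)(MiniState *L, void *ud);

/* Abandon the current execution and jump to the innermost recovery point. */
static void mini_throw(MiniState *L, int errcode) {
  assert(L->errorJmp != NULL);
  L->errorJmp->status = errcode;
  longjmp(L->errorJmp->b, 1);
}

/* Run f in protected mode; return 0 on success or the thrown code. */
static int mini_rawrunprotected(MiniState *L, MiniPfunc f, void *ud) {
  MiniJmp lj;
  lj.status = 0;
  lj.previous = L->errorJmp;  /* chain the new recovery point */
  L->errorJmp = &lj;
  if (setjmp(lj.b) == 0) {
    f(L, ud);                 /* a longjmp lands just after this branch */
  }
  L->errorJmp = lj.previous;  /* restore the old recovery point */
  return lj.status;
}

/* Example bodies: one throws halfway through, one completes. */
static void failing(MiniState *L, void *ud) {
  int *steps = ud;
  *steps = 1;
  mini_throw(L, 42);
  *steps = 2;  /* never reached */
}

static void succeeding(MiniState *L, void *ud) {
  (void)L;
  *(int *)ud = 7;
}
```

Calling mini_rawrunprotected with failing returns 42 and leaves the steps counter at 1, while succeeding returns 0. A pcall-style function can be layered on top of such a primitive by additionally saving and restoring the interpreter's own stacks around the call.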
Function Calls
Lua’s pcall is implemented as a function rather than a language mechanism. Implementing pcall involves saving and restoring state on the C stack.
In pure Lua function calls, C functions are generally not involved. The process is:
Generate a new CallInfo, adjust the data stack, then jump the bytecode execution position to the beginning of the called function. Lua’s return operation does the opposite - restore the data stack, pop the CallInfo, modify the bytecode execution position, and restore to the original execution sequence.
Lua’s internal API splits a call into luaD_precall and luaD_poscall, because:
- For a Lua function, luaD_precall runs first to set up the CallInfo and the bytecode execution position. The bytecode is then executed by luaV_execute, and luaD_poscall restores the state when the function returns.
- For a C function, there is no bytecode to execute afterwards, so only luaD_precall needs to be called: it invokes the C function and restores the state itself.
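The split between the two kinds of calls can be illustrated with a toy interpreter. This is only a sketch under heavy simplifications: the opcodes, MiniVM, MiniFunc, and all mini_* functions are hypothetical, there are no arguments or nested frames, and a frame is reduced to a depth counter and a bytecode position.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_PUSH, OP_ADD, OP_RETURN };  /* toy opcodes, not Lua's */

typedef struct MiniVM MiniVM;
typedef int (*MiniCFunc)(MiniVM *vm);  /* returns the number of results */

typedef struct MiniFunc {
  MiniCFunc cfunc;  /* non-NULL for a C function */
  const int *code;  /* toy bytecode for an interpreted function */
} MiniFunc;

struct MiniVM {
  int stack[16];
  int top;        /* index of the first free slot */
  int depth;      /* number of active frames */
  const int *pc;  /* bytecode position of the current frame */
};

/* Restore state after a call: pop the frame; results stay on the stack. */
static void mini_poscall(MiniVM *vm) {
  vm->depth--;
  vm->pc = NULL;
}

/* Prepare a call. Returns 1 if the call is already finished (C function),
   0 if the caller still has to run the bytecode. */
static int mini_precall(MiniVM *vm, const MiniFunc *f) {
  vm->depth++;
  if (f->cfunc != NULL) {
    f->cfunc(vm);      /* run the C function right away */
    mini_poscall(vm);  /* and restore the state immediately */
    return 1;
  }
  vm->pc = f->code;    /* point at the first instruction */
  return 0;
}

/* Execute bytecode until the function returns. */
static void mini_execute(MiniVM *vm) {
  for (;;) {
    switch (*vm->pc++) {
      case OP_PUSH:
        vm->stack[vm->top++] = *vm->pc++;
        break;
      case OP_ADD:
        vm->top--;
        vm->stack[vm->top - 1] += vm->stack[vm->top];
        break;
      case OP_RETURN:
        mini_poscall(vm);
        return;
      default:
        assert(0);  /* unknown opcode */
        return;
    }
  }
}

static void mini_call(MiniVM *vm, const MiniFunc *f) {
  if (!mini_precall(vm, f))  /* interpreted function? */
    mini_execute(vm);        /* then run its bytecode */
}

/* Example C function: pushes one result. */
static int c_push_seven(MiniVM *vm) {
  vm->stack[vm->top++] = 7;
  return 1;
}
```

In this toy, mini_call only enters the interpreter loop when mini_precall reports that bytecode is still pending, which is the reason for having two separate functions.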
C Tricks
TValue uses the NaN trick to save memory (a Lua 5.2 feature). Under IEEE 754, a double whose exponent bits are all 1 represents infinity when its mantissa bits are all 0, and “not a number” (NaN) when the mantissa is non-zero. Hardware arithmetic normally produces only one default NaN pattern for invalid operations such as 0/0 (0xfff8000000000000 on x86), so any other NaN bit pattern can be considered a deliberately crafted value for special purposes.
The NaN trick exploits this situation. On 32-bit machines, every value other than a number (a 32-bit pointer or an int) fits, together with its type tag, into the 52 mantissa bits of a double. The separate type field is no longer needed, which saves 4 bytes of memory per value.
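A minimal sketch of the packing idea, assuming a 64-bit slot whose high 32 bits carry a marker plus a type tag and whose low 32 bits carry the payload. The names Boxed, BOX_MARK, BOX_MASK, and the helper functions are hypothetical and the constants were chosen for illustration; Lua's own macros differ in detail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t Boxed;  /* 8 bytes: either a double or a tagged payload */

/* High word of a NaN pattern that ordinary arithmetic does not produce
   (exponent all ones, non-zero mantissa, quiet bit clear). */
#define BOX_MARK 0x7FF7A500u
#define BOX_MASK 0x7FFFFF00u  /* bits compared against the marker */

static Boxed box_number(double d) {
  Boxed b;
  assert(sizeof d == sizeof b);
  memcpy(&b, &d, sizeof b);  /* reinterpret the bits without aliasing issues */
  return b;
}

/* Store an 8-bit tag and a 32-bit payload inside a NaN pattern. */
static Boxed box_tagged(uint32_t tag, uint32_t payload) {
  assert(tag <= 0xFFu);
  return ((uint64_t)(BOX_MARK | tag) << 32) | payload;
}

/* Anything whose high word does not carry the marker is a plain double. */
static int boxed_is_number(Boxed b) {
  return ((uint32_t)(b >> 32) & BOX_MASK) != BOX_MARK;
}

static uint32_t tag_of(Boxed b) {
  assert(!boxed_is_number(b));
  return (uint32_t)(b >> 32) & 0xFFu;
}

static uint32_t payload_of(Boxed b) {
  assert(!boxed_is_number(b));
  return (uint32_t)b;
}

static double number_of(Boxed b) {
  double d;
  assert(boxed_is_number(b));
  memcpy(&d, &b, sizeof d);
  return d;
}
```

The boxed value is kept in an integer type and converted to double only after the check, so a crafted NaN never travels through a floating-point register. An ordinary NaN produced by arithmetic still counts as a number, because its high word does not match the marker.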
Reference
- IEEE_754
- float uses IEEE R32.24, double uses IEEE R64.53
https://github.com/xiaocang/lua-5.2.2_with_comments/releases/tag/data_n_call_stack