Learning LuaJIT (5): The string.buffer Library

Contents

    • Using the String Buffer Library
      • Buffer Objects
      • Buffer Method Overview
    • Buffer Creation and Management
      • `local buf = buffer.new([size [,options]])` / `local buf = buffer.new([options])`
      • `buf = buf:reset()`
      • `buf = buf:free()`
    • Buffer Writers
      • `buf = buf:put([str|num|obj] [,…])`
      • `buf = buf:putf(format, …)`
      • `buf = buf:putcdata(cdata, len)` (FFI)
      • `buf = buf:set(str)`
      • `buf = buf:set(cdata, len)` (FFI)
      • `ptr, len = buf:reserve(size)` (FFI)
      • `buf = buf:commit(used)` (FFI)
    • Buffer Readers
      • `len = #buf`
      • `res = str|num|buf .. str|num|buf […]`
      • `buf = buf:skip(len)`
      • `str, … = buf:get([len|nil] [,…])`
      • `str = buf:tostring()`
      • `str = tostring(buf)`
      • `ptr, len = buf:ref()` (FFI)
    • Serialization of Lua Objects
        • Example: serializing Lua objects
    • Error handling
    • FFI caveats
        • Explanation of the example

The string buffer library allows high-performance manipulation of string-like data.

Unlike Lua strings, which are constants, string buffers are mutable sequences of 8-bit (binary-transparent) characters. Data can be stored, formatted and encoded into a string buffer and later converted, extracted or decoded.

The convenient string buffer API simplifies common string manipulation tasks that would otherwise require creating many intermediate strings. String buffers improve performance by eliminating redundant memory copies, object creation, string interning and garbage collection overhead. In conjunction with the FFI library, they allow zero-copy operations.

The string buffer library also includes a high-performance serializer for Lua objects.

Using the String Buffer Library

The string buffer library is built into LuaJIT by default, but it’s not loaded by default. Add this to the start of every Lua file that needs one of its functions:

local buffer = require("string.buffer")

The convention for the syntax shown on this page is that buffer refers to the buffer library and buf refers to an individual buffer object.

Please note the difference between a Lua function call, e.g. buffer.new() (with a dot) and a Lua method call, e.g. buf:reset() (with a colon).

Buffer Objects

A buffer object is a garbage-collected Lua object. After creation with buffer.new(), it can (and should) be reused for many operations. When the last reference to a buffer object is gone, it will eventually be freed by the garbage collector, along with the allocated buffer space.

Buffers operate like a FIFO (first-in first-out) data structure. Data can be appended (written) to the end of the buffer and consumed (read) from the front of the buffer. These operations may be freely mixed.
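The FIFO behavior can be sketched as follows. This is a minimal illustration that assumes a LuaJIT 2.1 build shipping string.buffer; the variable names are made up for the example.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()
buf:put("hello")           -- write 5 bytes to the end
local head = buf:get(2)    -- read 2 bytes from the front
buf:put(" world")          -- writes and reads can be interleaved
local rest = buf:get()     -- consume everything that is left

print(head, rest)
```

Data always leaves the buffer in the order it was written, regardless of how the put and get calls are interleaved.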

The buffer space that holds the characters is managed automatically — it grows as needed and already consumed space is recycled. Use buffer.new(size) and buf:free(), if you need more control.

The maximum size of a single buffer is the same as the maximum size of a Lua string, which is slightly below two gigabytes. For huge data sizes, neither strings nor buffers are the right data structure — use the FFI library to directly map memory or files up to the virtual memory limit of your OS.

Buffer Method Overview

  • The buf:put*()-like methods append (write) characters to the end of the buffer.
  • The buf:get*()-like methods consume (read) characters from the front of the buffer.
  • Other methods, like buf:tostring() only read the buffer contents, but don’t change the buffer.
  • The buf:set() method allows zero-copy consumption of a string or an FFI cdata object as a buffer.
  • The FFI-specific methods allow zero-copy read/write-style operations or modifying the buffer contents in-place. Please check the FFI caveats below, too.
  • Methods that don’t need to return anything specific, return the buffer object itself as a convenience. This allows method chaining, e.g.: buf:reset():encode(obj) or buf:skip(len):get()
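Method chaining, as mentioned in the last item, might look like this in practice. It is a small sketch under the assumption of LuaJIT 2.1 with string.buffer available; the data is arbitrary.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()
buf:put("stale data")

-- reset(), put() and skip() all return buf, so the calls can be chained.
-- Only the final get() returns something else: the consumed string.
local chained = buf:reset():put("abc", "def"):skip(2):get()

print(chained)
```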

Buffer Creation and Management

local buf = buffer.new([size [,options]])
local buf = buffer.new([options])

Creates a new buffer object.

The optional size argument ensures a minimum initial buffer size. This is strictly an optimization when the required buffer size is known beforehand. The buffer space will grow as needed, in any case.

The optional table options sets various serialization options.

buf = buf:reset()

Reset (empty) the buffer. The allocated buffer space is not freed and may be reused.

buf = buf:free()

The buffer space of the buffer object is freed. The object itself remains intact, empty and may be reused.

Note: you normally don’t need to use this method. The garbage collector automatically frees the buffer space when the buffer object is collected. Use this method if you need to free the associated memory immediately.
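A possible usage pattern for these three calls is sketched below: one buffer is created with a size hint, reused across iterations with reset(), and released with free(). The size 4096 and the record format are arbitrary choices for illustration.

```lua
local buffer = require("string.buffer")

-- Size hint only: the buffer still grows if more space is needed.
local buf = buffer.new(4096)

local lines = {}
for i = 1, 3 do
  buf:reset()                  -- empty the buffer, keep its allocated space
  buf:put("record ", i)
  lines[i] = buf:tostring()
end

buf:free()                     -- release the buffer space right away
buf:put("still usable")        -- the object itself can be reused afterwards

print(lines[1], lines[3], #buf)
```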

Buffer Writers

buf = buf:put([str|num|obj] [,…])

Appends a string str, a number num or any object obj with a __tostring metamethod to the buffer. Multiple arguments are appended in the given order.

Appending a buffer to a buffer is possible and short-circuited internally. But it still involves a copy. Better combine the buffer writes to use a single buffer.
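As a brief sketch of the three accepted argument kinds, the snippet below appends a string, a table with a __tostring metamethod, and a number in one call. The point object and its formatting are hypothetical and exist only for this example.

```lua
local buffer = require("string.buffer")

-- Hypothetical object: any value with a __tostring metamethod is accepted.
local point = setmetatable({ x = 1, y = 2 }, {
  __tostring = function(p) return "(" .. p.x .. "," .. p.y .. ")" end,
})

local buf = buffer.new()
buf:put("p = ", point, ", n = ", 42)   -- arguments are appended in order

local s = buf:tostring()
print(s)
```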

buf = buf:putf(format, …)

Appends the formatted arguments to the buffer. The format string supports the same options as string.format().
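A short sketch of putf(), using only format specifiers that string.format() also supports. The keys and values are arbitrary.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()
-- putf() returns buf, so several formatted writes can be chained.
buf:putf("%s=%d;", "width", 80):putf("%s=%.2f;", "scale", 1.5)

local s = buf:tostring()
print(s)
```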

buf = buf:putcdata(cdata, len) (FFI)

Appends the given len number of bytes from the memory pointed to by the FFI cdata object to the buffer. The object needs to be convertible to a (constant) pointer.
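For illustration, the sketch below appends the bytes of a small FFI array. It assumes the ffi library is available (it is built into LuaJIT); the array contents are arbitrary.

```lua
local buffer = require("string.buffer")
local ffi = require("ffi")

-- Four bytes spelling "Lua!" in ASCII.
local arr = ffi.new("uint8_t[4]", { 0x4c, 0x75, 0x61, 0x21 })

local buf = buffer.new()
buf:put("Hi "):putcdata(arr, 4)   -- copies 4 bytes from arr into the buffer

local s = buf:tostring()
print(s)
```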

buf = buf:set(str)

buf = buf:set(cdata, len) (FFI)

This method allows zero-copy consumption of a string or an FFI cdata object as a buffer. It stores a reference to the passed string str or the FFI cdata object in the buffer. Any buffer space originally allocated is freed. This is not an append operation, unlike the buf:put*() methods.

After calling this method, the buffer behaves as if buf:free():put(str) or buf:free():put(cdata, len) had been called. However, the data is only referenced and not copied, as long as the buffer is only consumed.

In case the buffer is written to later on, the referenced data is copied and the object reference is removed (copy-on-write semantics).

The stored reference is an anchor for the garbage collector and keeps the originally passed string or FFI cdata object alive.
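The observable behavior of set() can be sketched as follows. Whether a copy happens internally is not visible from Lua, so the comments only restate what the text above describes; the payload string is invented for the example.

```lua
local buffer = require("string.buffer")

local payload = "HEADERbody"
local buf = buffer.new()

buf:set(payload)               -- reference the string instead of copying it
local header = buf:get(6)      -- consuming works on the referenced data
local body = buf:tostring()

buf:put("!")                   -- a later write makes the buffer copy the data first
local after = buf:tostring()

print(header, body, after)
```

The original string is never modified; the write only affects the buffer's own copy.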

ptr, len = buf:reserve(size) (FFI)

buf = buf:commit(used) (FFI)

The reserve method reserves at least size bytes of write space in the buffer. It returns an uint8_t * FFI cdata pointer ptr that points to this space.

The available length in bytes is returned in len. This is at least size bytes, but may be more to facilitate efficient buffer growth. You can either make use of the additional space or ignore len and only use size bytes.

The commit method appends the used bytes of the previously returned write space to the buffer data.

This pair of methods allows zero-copy use of C read-style APIs:

local MIN_SIZE = 65536
repeat
  local ptr, len = buf:reserve(MIN_SIZE)
  local n = C.read(fd, ptr, len)
  if n == 0 then break end -- EOF.
  if n < 0 then error("read error") end
  buf:commit(n)
until false

The reserved write space is not initialized. At least the used bytes must be written to before calling the commit method. There’s no need to call the commit method, if nothing is added to the buffer (e.g. on error).
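A self-contained variant of the reserve/commit pair, which needs no file descriptor, is sketched below. Here ffi.copy() stands in for a C read-style function; the source string is arbitrary.

```lua
local buffer = require("string.buffer")
local ffi = require("ffi")

local src = "zero-copy"
local buf = buffer.new()

local ptr, len = buf:reserve(#src)   -- len may be larger than requested
ffi.copy(ptr, src, #src)             -- fill the reserved space (stand-in for C.read)
buf:commit(#src)                     -- only now do the bytes become buffer data

local out = buf:tostring()
print(out, len >= #src)
```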

Buffer Readers

len = #buf

Returns the current length of the buffer data in bytes.

res = str|num|buf .. str|num|buf […]

The Lua concatenation operator .. also accepts buffers, just like strings or numbers. It always returns a string and not a buffer.

Note that although this is supported for convenience, this thwarts one of the main reasons to use buffers, which is to avoid string allocations. Rewrite it with buf:put() and buf:get().

Mixing this with unrelated objects that have a __concat metamethod may not work, since these probably only expect strings.
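The suggested rewrite can be sketched like this. Both forms produce the same string, but the second keeps the intermediate data in the buffer until it is consumed once.

```lua
local buffer = require("string.buffer")

local buf = buffer.new():put("abc")

-- Works, but creates a new Lua string for the result.
local joined = buf .. "def"

-- Same result: append to the buffer, then consume it once.
local joined2 = buf:put("def"):get()

print(joined, joined2)
```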

buf = buf:skip(len)

Skips (consumes) len bytes from the buffer up to the current length of the buffer data.

str, … = buf:get([len|nil] [,…])

Consumes the buffer data and returns one or more strings. If called without arguments, the whole buffer data is consumed. If called with a number, up to len bytes are consumed. A nil argument consumes the remaining buffer space (this only makes sense as the last argument). Multiple arguments consume the buffer data in the given order.

Note: a zero length or no remaining buffer data returns an empty string and not nil.
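As an illustration of the argument forms, the sketch below splits a made-up request line with a single get() call and then shows the empty-string result on an exhausted buffer.

```lua
local buffer = require("string.buffer")

local buf = buffer.new():put("GET /index.html")

-- 3 bytes, then 1 byte, then (nil) whatever remains.
local method, sep, path = buf:get(3, 1, nil)

-- Nothing is left, so get() returns an empty string rather than nil.
local empty = buf:get()

print(method, path, #empty)
```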

str = buf:tostring()

str = tostring(buf)

Creates a string from the buffer data, but doesn’t consume it. The buffer remains unchanged.

Buffer objects also define a __tostring metamethod. This means buffers can be passed to the global tostring() function and many other functions that accept this in place of strings. The important internal uses in functions like io.write() are short-circuited to avoid the creation of an intermediate string object.
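The non-consuming behavior can be checked with a few lines. This is a sketch; the io.write() call relies on the buffer support described above.

```lua
local buffer = require("string.buffer")

local buf = buffer.new():put("log line\n")

local a = buf:tostring()
local b = tostring(buf)      -- goes through the __tostring metamethod
local len_after = #buf       -- neither call consumed any data

io.write(buf)                -- the buffer is passed directly, in place of a string
```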

ptr, len = buf:ref() (FFI)

Returns an uint8_t * FFI cdata pointer ptr that points to the buffer data. The length of the buffer data in bytes is returned in len.

The returned pointer can be directly passed to C functions that expect a buffer and a length. You can also do bytewise reads (local x = ptr[i]) or writes (ptr[i] = 0x40) of the buffer data.

In conjunction with the skip method, this allows zero-copy use of C write-style APIs:

repeat
  local ptr, len = buf:ref()
  if len == 0 then break end
  local n = C.write(fd, ptr, len)
  if n < 0 then error("write error") end
  buf:skip(n)
until n >= len

Unlike Lua strings, buffer data is not implicitly zero-terminated. It’s not safe to pass ptr to C functions that expect zero-terminated strings. If you’re not using len, then you’re doing something wrong.
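A self-contained sketch of ref() without any C I/O is shown below: one byte is modified in place, and ffi.string() is given the explicit length because the data is not zero-terminated. The contents are arbitrary.

```lua
local buffer = require("string.buffer")
local ffi = require("ffi")

local buf = buffer.new():put("hello")

local ptr, len = buf:ref()
ptr[0] = 0x48                        -- in-place write: 'h' becomes 'H'

-- Always pass len: the buffer data has no terminating zero byte.
local view = ffi.string(ptr, len)

print(view, buf:tostring())
```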

Serialization of Lua Objects

Skipped in these notes.

Example: serializing Lua objects

local buffer = require("string.buffer")

-- Two metatables that supply default values for missing keys.
local mt1 = { __index = function(t, k) return "default" end }
local mt2 = { __index = function(t, k) return "another default" end }

-- The tables to serialize.
local t1 = setmetatable({ key1 = "value1", key2 = "value2" }, mt1)
local t2 = setmetatable({ key1 = "value3", key2 = "value4" }, mt2)

-- The dictionary of string keys and the array of metatables, passed as options.
local dict = {"key1", "key2"}
local metatable = {mt1, mt2}

-- Create a buffer with these serialization options.
local buffer_obj = buffer.new({ dict = dict, metatable = metatable })

-- encode() appends the serialized object to the buffer and returns the buffer.
buffer_obj:encode({t1, t2})

-- decode() takes no argument: it consumes one serialized object from the buffer.
local decoded_data = buffer_obj:decode()

-- Access the decoded data.
for _, tbl in ipairs(decoded_data) do
  print(tbl.key1, tbl.key2)
end
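When no options and no buffer reuse are needed, the library also provides the standalone functions buffer.encode(obj) and buffer.decode(str), which work on plain Lua strings. They are not covered in these notes; the round trip below is a minimal sketch with an invented table.

```lua
local buffer = require("string.buffer")

local original = { name = "config", values = { 1, 2, 3 }, nested = { ok = true } }

local str = buffer.encode(original)   -- serialize to a Lua string
local copy = buffer.decode(str)       -- rebuild a new, independent table

print(copy.name, copy.values[3], copy.nested.ok)
```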

Error handling

Many of the buffer methods can throw an error. Out-of-memory or usage errors are best caught with an outer wrapper for larger parts of code. There’s not much one can do after that, anyway.

On the other hand, you may want to catch some errors individually. Buffer methods need to receive the buffer object as the first argument. The Lua colon-syntax obj:method() does that implicitly. But to wrap a method with pcall(), the arguments need to be passed like this:

local ok, err = pcall(buf.encode, buf, obj)
if not ok then
  -- Handle error in err.
end
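The same pcall() pattern can be tried end to end with a usage error that is easy to trigger. The helper try_put is hypothetical and exists only for this sketch; it relies on put() rejecting a table that has no __tostring metamethod.

```lua
local buffer = require("string.buffer")

-- Hypothetical helper: append one value, return false plus a message on failure.
local function try_put(buf, value)
  local ok, err = pcall(buf.put, buf, value)   -- buf is passed explicitly
  if not ok then return false, err end
  return true
end

local buf = buffer.new()
local ok1 = try_put(buf, "text")       -- strings are accepted
local ok2, err2 = try_put(buf, {})     -- a plain table has no __tostring

print(ok1, ok2, err2)
```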

FFI caveats

The string buffer library has been designed to work well together with the FFI library. But due to the low-level nature of the FFI library, some care needs to be taken:

First, please remember that FFI pointers are zero-indexed. The space returned by buf:reserve() and buf:ref() starts at the returned pointer and ends before len bytes after that.

I.e. the first valid index is ptr[0] and the last valid index is ptr[len-1]. If the returned length is zero, there’s no valid index at all. The returned pointer may even be NULL.

The space pointed to by the returned pointer is only valid as long as the buffer is not modified in any way (neither append, nor consume, nor reset, etc.). The pointer is also not a GC anchor for the buffer object itself.

Buffer data is only guaranteed to be byte-aligned. Casting the returned pointer to a data type with higher alignment may cause unaligned accesses. It depends on the CPU architecture whether this is allowed or not (it’s always OK on x86/x64 and mostly OK on other modern architectures).

FFI pointers or references do not count as GC anchors for an underlying object. E.g. an array allocated with ffi.new() is anchored by buf:set(array, len), but not by buf:set(array+offset, len). The addition of the offset creates a new pointer, even when the offset is zero. In this case, you need to make sure there’s still a reference to the original array as long as its contents are in use by the buffer.

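Explanation of the example:
  1. Normal reference: with buf:set(array, len), array is an array created through the FFI and passed to the buffer directly. The buffer stores a reference to it, and that stored reference acts as a GC anchor: as long as buf exists and holds the reference, array is not collected.
  2. After adding an offset: array + offset (a pointer to some element inside array) creates a new pointer object, even when offset is zero. This new pointer does not anchor array, because a pointer holds only an address, not a reference to the original object.
    • The problem: if only array + offset is passed on, the garbage collector may conclude that the original array is no longer in use and free it, even though buf still depends on its contents. Reading through buf then accesses freed memory, which is undefined behavior and may crash.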
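One way to stay safe with an offset pointer is sketched below: the original array is kept in a variable (any live reference would do) for as long as the buffer uses its contents. The array size and contents are arbitrary.

```lua
local buffer = require("string.buffer")
local ffi = require("ffi")

local array = ffi.new("uint8_t[?]", 8)
ffi.copy(array, "ABCDEFGH", 8)

local buf = buffer.new()
-- array + 2 is a new pointer, so the buffer does not anchor `array` here.
buf:set(array + 2, 4)

-- `array` is still referenced by the local variable while the data is consumed.
local s = buf:get()

print(s)
```

Dropping the last reference to array before the buffer is consumed or reset would leave the buffer pointing at memory that the garbage collector is free to reclaim.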

Even though each LuaJIT VM instance is single-threaded (but you can create multiple VMs), FFI data structures can be accessed concurrently. Be careful when reading/writing FFI cdata from/to buffers to avoid concurrent accesses or modifications. In particular, the memory referenced by buf:set(cdata, len) must not be modified while buffer readers are working on it. Shared, but read-only memory mappings of files are OK, but only if the file does not change.
