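Preface
Having finished (really just skimmed) fuzzing, I'm moving on to symbolic execution, the next of the three staples of binary analysis (fuzzing, symbolic execution, taint analysis). If fuzzing is flood irrigation that wins by sheer volume, symbolic execution is more like precise infiltration: it tries to explore every branch it can.
Prerequisites:
- Binary analysis
- CTF pwn basics
I prefer learning through CTF challenges; grinding through theory on its own is painful for me. Project: https://github.com/jakespringer/angr_ctf
1. Environment setup
I work on an AMD64 machine with a VMware virtual machine. The binaries must match the architecture you run on; a Mac is ARM and the environment there is incomplete (I have lost count of how many times this has bitten me).
Install Ubuntu 24 in VMware, then clone the project into your working directory: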
git clone https://github.com/jakespringer/angr_ctf
cd angr_ctf
python3 -m venv .venv
source .venv/bin/activate
pip install angr jinja2
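Before moving on, it can help to confirm that the virtual environment actually sees the packages that were just installed. The snippet below is a minimal sketch using only the standard library; the helper name `check_modules` is my own and is not part of angr or angr_ctf.

```python
import importlib.util
import shutil


def check_modules(names):
    """Return {module_name: True/False} depending on whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, ok in check_modules(["angr", "jinja2"]).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
    # generate.py shells out to gcc, so gcc has to be on PATH as well.
    print("gcc:", shutil.which("gcc") or "MISSING")
```

Because generate.py compiles with -m32, a 64-bit Ubuntu will most likely also need 32-bit build support (commonly the gcc-multilib package); treat that as an assumption about your setup rather than a guaranteed requirement.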
2. Using the project
The cloned project contains many directories laid out like this:
.
├── 00_angr_find
│ ├── 00_angr_find.c.jinja
│ ├── generate.py
│ ├── __init__.py
│ └── scaffold00.py
├── 01_angr_avoid
│ ├── 01_angr_avoid.c.jinja
│ ├── generate.py
│ ├── __init__.py
│ └── scaffold01.py
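Taking 00_angr_find as an example, here is what each file is for:
- 00_angr_find.c.jinja: the source code of the challenge; normally you don't need to touch it.
- generate.py: the script that builds the challenge. Typical usage: python generate.py 123 A (123 is the seed, A is the output file name)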
#!/usr/bin/env python3
import sys, random, os, tempfile, jinja2

def generate(argv):
    if len(argv) != 3:
        print('Usage: ./generate.py [seed] [output_file]')
        sys.exit()

    seed = argv[1]
    output_file = argv[2]
    random.seed(seed)

    userdef_charset = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    userdef = ''.join(random.choice(userdef_charset) for _ in range(8))
    template = open(os.path.join(os.path.dirname(os.path.realpath(__file__)), '00_angr_find.c.jinja'), 'r').read()
    t = jinja2.Template(template)
    c_code = t.render(userdef=userdef, len_userdef=len(userdef), description = '')

    with tempfile.NamedTemporaryFile(delete=False, suffix='.c', mode='w') as temp:
        temp.write(c_code)
        temp.seek(0)
        os.system('gcc -fno-pie -no-pie -fcf-protection=none -m32 -o ' + output_file + ' ' + temp.name)

if __name__ == '__main__':
    generate(sys.argv)
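- __init__.py: not needed for now.
- scaffold00.py: the official solution template provided for each challenge, similar to an exp in pwn.
3. 00_angr_find
3.1 Preparing the environment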
source .venv/bin/activate
cd 00_angr_find
python generate.py 123 A
To keep things reproducible, I use the fixed seed 123 and the output name A; you can use any seed and name you like.
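The reason a fixed seed is convenient is that generate.py derives the 8-letter secret from Python's seeded random generator, so the same seed should give the same secret on the same Python version. The sketch below mirrors that idea in isolation; `derive_userdef` is a hypothetical helper of mine, and I am not claiming its output matches the string compiled into your binary.

```python
import random

CHARSET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def derive_userdef(seed, length=8):
    """Pick `length` uppercase letters from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, leaves global state alone
    return "".join(rng.choice(CHARSET) for _ in range(length))


if __name__ == "__main__":
    first = derive_userdef("123")
    second = derive_userdef("123")
    # Same seed, same sequence of choices.
    print(first, second, first == second)
```

Note that the seed arrives from the command line as a string ("123"), which is why the sketch seeds with a string as well.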
Copy A to the host machine and open it in IDA Pro.

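It turns out to be very simple: anyone who does reversing will see at a glance that it is just an encrypt-then-compare routine. Our goal, though, is to have a program find the input that drives execution to puts("Good Job.").
3.2 Studying the template
Let's look at the official template first. To leave room for my own explanations I removed the official comments here; when in doubt, defer to the official comments!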
import angr
import sys

def main(argv):
    path_to_binary = ???
    project = angr.Project(path_to_binary)
    initial_state = project.factory.entry_state(
        add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                     angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )
    simulation = project.factory.simgr(initial_state)

    print_good_address = ???
    simulation.explore(find=print_good_address)

    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)
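3.2.1 Imports
Since this is the first challenge, let's go through the template piece by piece. Start with the top and bottom of the file: import angr plays the same role here as from pwn import * does with pwntools, and it will be a familiar face in every script from now on.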
import angr
import sys
if __name__ == '__main__':
    main(sys.argv)
3.2.2 Loading the binary
Next comes the project object, angr.Project. It takes the path to the binary, much like remote or process in pwn.
path_to_binary = ???  # example: "./A"
project = angr.Project(path_to_binary)
In other words, these two statements can be shortened to: project = angr.Project("./A")
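The official scaffold comments also suggest taking the path from argv[1] so the same script can be pointed at different binaries. A minimal sketch of that idea, with a fallback to "./A", could look like the following; `binary_path_from_argv` is a hypothetical helper name and the default is just the file name used in this post.

```python
import sys


def binary_path_from_argv(argv, default="./A"):
    """Use argv[1] as the binary path if it was given, else fall back to default."""
    return argv[1] if len(argv) > 1 else default


if __name__ == "__main__":
    path_to_binary = binary_path_from_argv(sys.argv)
    print("Would load:", path_to_binary)
    # In the real script this value is passed on to angr.Project(path_to_binary).
```

Run it as python solve.py ./A or simply python solve.py (the script name solve.py is only an example).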
3.2.3 Initializing the executor
initial_state = project.factory.entry_state(
    add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                 angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
simulation = project.factory.simgr(initial_state)
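There is no need to dig into these two statements yet. It is enough to know that:
- project.factory.entry_state creates a state that is ready to start executing
- project.factory.simgr(initial_state) creates the simulation manager that drives symbolic execution over states
The add_options argument does not matter for this challenge and can be studied later. Don't try to understand everything up front, which is a quick way to get discouraged; just treat these two statements as "an executor has been prepared".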
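To get an intuition for what a state manager does, here is a deliberately tiny toy model. It is not how angr is implemented: real states carry registers, memory and symbolic constraints, while this sketch only tracks a path through a made-up, loop-free control-flow graph. The addresses, `TOY_CFG` and `toy_explore` are all invented for illustration; only the stash names active and found are borrowed from angr.

```python
from collections import deque

# Hypothetical control-flow graph: address -> list of successor addresses.
TOY_CFG = {
    0x1000: [0x1010, 0x1020],  # a branch that depends on the input
    0x1010: [0x1030],          # leads to the "Try again." side
    0x1020: [0x1040],          # leads to the "Good Job." side
    0x1030: [],
    0x1040: [],
}


def toy_explore(cfg, entry, find):
    """Breadth-first search over paths; stop at the first one reaching `find`."""
    active = deque([[entry]])  # each "state" is just the path taken so far
    found = []
    while active and not found:
        path = active.popleft()
        if path[-1] == find:
            found.append(path)
            break
        # A branch forks the state: one new path per successor.
        for successor in cfg[path[-1]]:
            active.append(path + [successor])
    return found


if __name__ == "__main__":
    for path in toy_explore(TOY_CFG, 0x1000, 0x1040):
        print(" -> ".join(hex(addr) for addr in path))
```

The point to take away is the forking: every branch splits one state into several, and the manager keeps stepping the active ones until one of them lands on the target.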
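3.2.4 Finding the address
The next step is to pick the target address, i.e. the address you want the program to reach, and then run the search: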
print_good_address = ???  # for example: 0x80492E3
simulation.explore(find=print_good_address)
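In this example the location we want to reach is 0x80492E3, so we fill in 0x80492E3. Note that this is an integer, not a string, and it does not need to be wrapped in p32(). Don't mix it up with pwntools habits, even though the two tools do sometimes need to work together.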
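The difference between "an address as a number" and "an address packed for a payload" can be shown without angr or pwntools at all. In this sketch struct.pack("<I", ...) stands in for what p32() produces with pwntools' default settings (4 little-endian bytes); the variable names are mine.

```python
import struct

# An address copied from the disassembler usually starts life as text.
target_text = "0x80492E3"

# What explore(find=...) wants: a plain Python integer.
print_good_address = int(target_text, 16)

# What a pwn payload wants: raw bytes. This is NOT what angr expects here.
packed = struct.pack("<I", print_good_address)

print(type(print_good_address).__name__, hex(print_good_address))
print(type(packed).__name__, packed)
```

Writing the literal 0x80492E3 directly in the script gives the same integer, so the int(..., 16) step is only needed when the address arrives as a string.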

3.2.5 Printing the result
if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
    raise Exception('Could not find the solution')
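This part simply prints the result: the input that angr fed to stdin to reach the target. There is no need to dig into it for now; just use it as is.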
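One detail worth knowing: posix.dumps(...) returns bytes, and .decode() turns them into a string. If the solver happens to pick bytes that are not valid UTF-8, .decode() raises UnicodeDecodeError. A small defensive sketch, assuming you would rather see the raw bytes than crash, is shown below; `render_solution` is a hypothetical helper, not an angr function.

```python
def render_solution(raw: bytes) -> str:
    """Decode solver output as text, falling back to the bytes repr."""
    try:
        return raw.decode()
    except UnicodeDecodeError:
        # Not valid UTF-8: show the escaped bytes instead of failing.
        return repr(raw)


if __name__ == "__main__":
    print(render_solution(b"ABCDEFGH"))
    print(render_solution(b"\xff\xfe"))
```

In the script above it would wrap the existing call, e.g. print(render_solution(solution_state.posix.dumps(sys.stdin.fileno()))).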
3.2.6 The complete script
# Before you begin, here are a few notes about these capture-the-flag
# challenges.
#
# Each binary, when run, will ask for a password, which can be entered via stdin
# (typing it into the console.) Many of the levels will accept many different
# passwords. Your goal is to find a single password that works for each binary.
#
# If you enter an incorrect password, the program will print "Try again." If you
# enter a correct password, the program will print "Good Job."
#
# Each challenge will be accompanied by a file like this one, named
# "scaffoldXX.py". It will offer guidance as well as the skeleton of a possible
# solution. You will have to edit each file. In some cases, you will have to
# edit it significantly. While use of these files is recommended, you can write
# a solution without them, if you find that they are too restrictive.
#
# Places in the scaffoldXX.py that require a simple substitution will be marked
# with three question marks (???). Places that require more code will be marked
# with an ellipsis (...). Comments will document any new concepts, but will be
# omitted for concepts that have already been covered (you will need to use
# previous scaffoldXX.py files as a reference to solve the challenges.) If a
# comment documents a part of the code that needs to be changed, it will be
# marked with an exclamation point at the end, on a separate line (!).

import angr
import sys

def main(argv):
    # Create an Angr project.
    # If you want to be able to point to the binary from the command line, you can
    # use argv[1] as the parameter. Then, you can run the script from the command
    # line as follows:
    # python ./scaffold00.py [binary]
    # (!)
    path_to_binary = "./A"
    project = angr.Project(path_to_binary)

    # Tell Angr where to start executing (should it start from the main()
    # function or somewhere else?) For now, use the entry_state function
    # to instruct Angr to start from the main() function.
    initial_state = project.factory.entry_state(
        add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                     angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # Create a simulation manager initialized with the starting state. It provides
    # a number of useful tools to search and execute the binary.
    simulation = project.factory.simgr(initial_state)

    # Explore the binary to attempt to find the address that prints "Good Job."
    # You will have to find the address you want to find and insert it here.
    # This function will keep executing until it either finds a solution or it
    # has explored every possible path through the executable.
    # (!)
    print_good_address = 0x80492E3
    simulation.explore(find=print_good_address)

    # Check that we have found a solution. The simulation.explore() method will
    # set simulation.found to a list of the states that it could find that reach
    # the instruction we asked it to search for. Remember, in Python, if a list
    # is empty, it will be evaluated as false, otherwise true.
    if simulation.found:
        # The explore method stops after it finds a single state that arrives at the
        # target address.
        solution_state = simulation.found[0]

        # Print the string that Angr wrote to stdin to follow solution_state. This
        # is our solution.
        print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
    else:
        # If Angr could not find a path that reaches print_good_address, throw an
        # error. Perhaps you mistyped the print_good_address?
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)
Execution result
