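Preface
In day-to-day work we often need to search the local disk for specific files, whether by file name or by file content. The built-in Windows search is often slow to respond, while third-party tools can be bloated or cluttered with ads. To address this, I built a lightweight file search tool in Python. It supports multi-process acceleration, file type filtering, and keyword highlighting, which covers everyday file lookup needs.
The code is available on GitHub and can be downloaded from the following link:
https://github.com/ChenAI-TGF/File_Search_Tool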

- Preface
- How It Works
- Full Code
- Demo
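How It Works
The core design idea is to speed up searching with multi-process parallelism while keeping operation simple through a GUI. The main technical points are:
- Interface construction: the GUI is built with Tkinter and includes path selection, keyword input, file type filtering, progress display, and result display, so the tool stays simple to operate.
- Multi-process search: matching runs in a multiprocessing process pool to make use of multiple CPU cores. The main thread handles the interface, while a background thread scans the directory tree and dispatches chunks of files to worker processes, so the interface does not freeze during a search.
- Inter-process communication: a Manager creates a queue (Queue) and an event (Event) that are shared across processes and used for progress updates, result delivery, and stopping a search.
- File type filtering: files are filtered by extension. Preset categories (audio, video, documents, and so on) and custom extensions are supported, and only files of the selected types are searched.
- Keyword matching: file names and file contents are matched with a regular expression compiled from the escaped keyword, so the keyword is treated as a literal, case-insensitive substring. Matches are highlighted in the results.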
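The multi-process search and inter-process communication points above can be sketched roughly as follows. This is a simplified illustration rather than the tool's actual code: `match_chunk` and `parallel_search` are hypothetical names, the worker only matches file names, and progress is reported once per chunk.

```python
import os
import re
from multiprocessing import Manager, Pool


def match_chunk(paths, keyword, stop_event=None, progress_queue=None):
    """Worker: return the paths in one chunk whose file name contains keyword."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    matched = []
    processed = 0
    for path in paths:
        if stop_event is not None and stop_event.is_set():
            break  # the parent asked the workers to stop
        if pattern.search(os.path.basename(path)):
            matched.append(path)
        processed += 1
    if progress_queue is not None:
        progress_queue.put(processed)  # report how many files were handled
    return matched


def parallel_search(paths, keyword, workers=2):
    """Split paths into chunks and match them in a process pool."""
    if not paths:
        return [], 0
    chunk_size = max(1, len(paths) // workers)
    chunks = [paths[i:i + chunk_size] for i in range(0, len(paths), chunk_size)]
    with Manager() as manager:
        # Manager proxies can be passed to pool workers as ordinary arguments.
        stop_event = manager.Event()
        progress_queue = manager.Queue()
        with Pool(processes=workers) as pool:
            pending = [
                pool.apply_async(match_chunk, (chunk, keyword, stop_event, progress_queue))
                for chunk in chunks
            ]
            matched = [path for job in pending for path in job.get()]
        processed = 0
        while not progress_queue.empty():
            processed += progress_queue.get()
    return matched, processed
```

On Windows, `parallel_search` should be called from code under an `if __name__ == "__main__":` guard, because worker processes re-import the main module when they start.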
Full Code
Below is the complete implementation of the tool, with comments to make it easier to understand and extend:
import os
import re
import tkinter as tk
from tkinter import ttk, filedialog, scrolledtext, messagebox, simpledialog
import multiprocessing as mp
from multiprocessing import Pool, Manager
import queue
import threading
from functools import partial
import mimetypes

# File type categories (extensible)
FILE_TYPE_CATEGORIES = {
    "音频文件": ['.mp3', '.wav', '.flac', '.aac', '.ogg', '.wma', '.m4a', '.ape', '.alac'],
    "视频文件": ['.mp4', '.avi', '.mov', '.mkv', '.flv', '.wmv', '.mpg', '.mpeg', '.rmvb', '.3gp'],
    "图像文件": ['.jpg', '.jpeg', '.png', '.gif', '.bmp', '.tiff', '.webp', '.svg', '.psd', '.ai'],
    "文档文件": ['.doc', '.docx', '.pdf', '.txt', '.xls', '.xlsx', '.ppt', '.pptx', '.md', '.rtf'],
    "压缩文件": ['.zip', '.rar', '.7z', '.tar', '.gz', '.bz2'],
    "程序文件": ['.exe', '.dll', '.py', '.java', '.c', '.cpp', '.js', '.html', '.css'],
}


class FileSearchApp:
    def __init__(self, root):
        self.root = root
        self.root.title("文件智能搜索工具")
        self.root.geometry("1200x700")
        self.root.minsize(1000, 600)

        # Use a font that contains Chinese glyphs
        self.style = ttk.Style()
        self.style.configure(".", font=("SimHei", 10))

        # Search parameters
        self.search_path = tk.StringVar()
        self.keyword = tk.StringVar()
        self.search_type = tk.StringVar(value="name")  # "name" or "content"
        self.is_searching = False
        self.manager = Manager()  # creates objects shared across processes
        self.stop_event = self.manager.Event()  # cross-process stop event

        # File type filter state (only selected types are searched)
        self.category_vars = {}  # checkbox state per category
        self.extension_vars = {}  # checkbox state per extension
        self.included_extensions = set()  # extensions currently selected
        self.other_files_frame = None  # sub-frame for the "其他文件" extensions

        # Build the UI
        self.create_widgets()

        # Initialize the filter state (nothing is selected by default)
        self.init_file_type_filters()

        # Worker and queues
        self.progress_queue = None
        self.result_queue = None
        self.search_process = None

        # Search statistics
        self.total_files = 0
        self.processed_files = 0
        self.matched_files = 0

    def create_widgets(self):
        # Main frame, split into left and right panes
        main_paned = ttk.PanedWindow(self.root, orient=tk.HORIZONTAL)
        main_paned.pack(fill=tk.BOTH, expand=True, padx=5, pady=5)

        # Left: file type filter panel
        filter_frame = ttk.LabelFrame(main_paned, text="文件类型筛选(只搜索选中类型)", padding="10")
        main_paned.add(filter_frame, weight=1)

        # Scrollable area inside the filter panel
        self.filter_canvas = tk.Canvas(filter_frame)
        self.filter_scrollbar = ttk.Scrollbar(filter_frame, orient="vertical",
                                              command=self.filter_canvas.yview)
        self.filter_scrollable_frame = ttk.Frame(self.filter_canvas)
        self.filter_scrollable_frame.bind(
            "<Configure>",
            lambda e: self.filter_canvas.configure(scrollregion=self.filter_canvas.bbox("all"))
        )
        self.filter_canvas.create_window((0, 0), window=self.filter_scrollable_frame, anchor="nw")
        self.filter_canvas.configure(yscrollcommand=self.filter_scrollbar.set)
        self.filter_canvas.pack(side="left", fill="both", expand=True)
        self.filter_scrollbar.pack(side="right", fill="y")

        # Category and extension checkboxes
        for category, extensions in FILE_TYPE_CATEGORIES.items():
            # Checkbox for the whole category
            cat_var = tk.BooleanVar()
            self.category_vars[category] = cat_var
            cat_check = ttk.Checkbutton(self.filter_scrollable_frame,
                                        text=category, variable=cat_var,
                                        command=partial(self.toggle_category, category))
            cat_check.pack(anchor=tk.W, pady=5)

            # Checkboxes for the extensions in this category
            ext_frame = ttk.Frame(self.filter_scrollable_frame)
            ext_frame.pack(anchor=tk.W, padx=20)
            for ext in extensions:
                ext_var = tk.BooleanVar()
                self.extension_vars[ext] = ext_var
                ext_check = ttk.Checkbutton(ext_frame, text=ext, variable=ext_var,
                                            command=partial(self.update_category_state, category))
                ext_check.pack(side=tk.LEFT, padx=5, pady=2)

        # Button for adding a custom extension
        ttk.Button(filter_frame, text="添加自定义扩展名",
                   command=self.add_custom_extension).pack(pady=10, fill=tk.X)

        # Right: main working area
        right_frame = ttk.Frame(main_paned)
        main_paned.add(right_frame, weight=3)

        # Top frame: path selection
        path_frame = ttk.Frame(right_frame, padding="10")
        path_frame.pack(fill=tk.X)
        ttk.Label(path_frame, text="搜索路径:").pack(side=tk.LEFT, padx=5)
        ttk.Entry(path_frame, textvariable=self.search_path, width=50).pack(
            side=tk.LEFT, padx=5, fill=tk.X, expand=True)
        ttk.Button(path_frame, text="浏览...", command=self.browse_path).pack(side=tk.LEFT, padx=5)

        # Middle frame: search settings
        search_frame = ttk.Frame(right_frame, padding="10")
        search_frame.pack(fill=tk.X)
        ttk.Label(search_frame, text="搜索关键词:").pack(side=tk.LEFT, padx=5)
        ttk.Entry(search_frame, textvariable=self.keyword, width=30).pack(side=tk.LEFT, padx=5)
        ttk.Radiobutton(search_frame, text="按文件名", variable=self.search_type,
                        value="name").pack(side=tk.LEFT, padx=5)
        ttk.Radiobutton(search_frame, text="按文件内容", variable=self.search_type,
                        value="content").pack(side=tk.LEFT, padx=5)
        ttk.Button(search_frame, text="开始搜索", command=self.start_search).pack(side=tk.LEFT, padx=5)
        ttk.Button(search_frame, text="停止搜索", command=self.stop_search).pack(side=tk.LEFT, padx=5)

        # Overall progress frame
        progress_frame = ttk.Frame(right_frame, padding="10")
        progress_frame.pack(fill=tk.X)
        ttk.Label(progress_frame, text="总体进度:").pack(side=tk.LEFT, padx=5)
        self.overall_progress = ttk.Progressbar(progress_frame, orient="horizontal",
                                                length=100, mode="determinate")
        self.overall_progress.pack(side=tk.LEFT, padx=5, fill=tk.X, expand=True)

        # Stage progress frame
        stage_frame = ttk.Frame(right_frame, padding="10")
        stage_frame.pack(fill=tk.X)
        ttk.Label(stage_frame, text="当前阶段:").pack(side=tk.LEFT, padx=5)
        self.stage_label = ttk.Label(stage_frame, text="准备就绪")
        self.stage_label.pack(side=tk.LEFT, padx=5)
        self.stage_progress = ttk.Progressbar(stage_frame, orient="horizontal",
                                              length=100, mode="determinate")
        self.stage_progress.pack(side=tk.LEFT, padx=5, fill=tk.X, expand=True)

        # Status frame
        status_frame = ttk.Frame(right_frame, padding="10")
        status_frame.pack(fill=tk.X)
        self.status_label = ttk.Label(status_frame, text="等待开始搜索...")
        self.status_label.pack(anchor=tk.W)

        # Results frame
        result_frame = ttk.LabelFrame(right_frame, text="搜索结果", padding="10")
        result_frame.pack(fill=tk.BOTH, expand=True)
        self.result_display = scrolledtext.ScrolledText(result_frame, wrap=tk.WORD)
        self.result_display.pack(fill=tk.BOTH, expand=True, pady=5)

        # Style tags for the different kinds of result text
        self.result_display.tag_configure("filename", foreground="blue", font=("SimHei", 10, "bold"))
        self.result_display.tag_configure("linenum", foreground="green", font=("SimHei", 10))
        self.result_display.tag_configure("match", background="yellow")
        self.result_display.tag_configure("separator", foreground="gray")

        # Statistics frame
        stats_frame = ttk.Frame(right_frame, padding="10")
        stats_frame.pack(fill=tk.X)
        self.stats_label = ttk.Label(stats_frame, text="文件总数: 0 | 已处理: 0 | 匹配: 0")
        self.stats_label.pack(anchor=tk.W)

    def init_file_type_filters(self):
        """Initialize the file type filter state (nothing selected by default)."""
        pass  # keep the default: nothing selected

    def toggle_category(self, category, init=False):
        """Toggle a whole category and sync its extension checkboxes."""
        target_state = self.category_vars[category].get()
        for ext in FILE_TYPE_CATEGORIES[category]:
            self.extension_vars[ext].set(target_state)
        if not init:
            self.update_included_extensions()

    def update_category_state(self, category):
        """Update a category checkbox from the state of its extensions."""
        extensions = FILE_TYPE_CATEGORIES[category]
        checked_count = sum(1 for ext in extensions if self.extension_vars[ext].get())
        # The category is ticked only when all of its extensions are ticked
        self.category_vars[category].set(checked_count == len(extensions))
        self.update_included_extensions()

    def update_included_extensions(self):
        """Rebuild the set of extensions that are currently selected."""
        self.included_extensions = {ext for ext, var in self.extension_vars.items() if var.get()}

    def add_custom_extension(self):
        """Add a custom extension."""
        ext = simpledialog.askstring("添加自定义扩展名", "请输入扩展名(带点,如 .log):")
        if not ext:
            return

        # Normalize the extension
        if not ext.startswith('.'):
            ext = '.' + ext
        ext = ext.lower()

        # Check whether it already exists
        if ext in self.extension_vars:
            messagebox.showinfo("提示", f"扩展名 {ext} 已存在")
            return

        # Add it to the "其他文件" category, creating the category on first use
        if "其他文件" not in FILE_TYPE_CATEGORIES:
            FILE_TYPE_CATEGORIES["其他文件"] = []
            # Create the category checkbox
            cat_var = tk.BooleanVar()
            self.category_vars["其他文件"] = cat_var
            cat_check = ttk.Checkbutton(self.filter_scrollable_frame,
                                        text="其他文件", variable=cat_var,
                                        command=partial(self.toggle_category, "其他文件"))
            cat_check.pack(anchor=tk.W, pady=5)
            # Create the sub-frame that holds the extensions
            self.other_files_frame = ttk.Frame(self.filter_scrollable_frame)
            self.other_files_frame.pack(anchor=tk.W, padx=20)

        # Add the extension to the category list
        FILE_TYPE_CATEGORIES["其他文件"].append(ext)

        # Create its checkbox
        ext_var = tk.BooleanVar()
        self.extension_vars[ext] = ext_var
        ext_check = ttk.Checkbutton(self.other_files_frame,
                                    text=ext, variable=ext_var,
                                    command=partial(self.update_category_state, "其他文件"))
        ext_check.pack(side=tk.LEFT, padx=5, pady=2)
        messagebox.showinfo("成功", f"已添加自定义扩展名 {ext}")

    def browse_path(self):
        path = filedialog.askdirectory()
        if path:
            self.search_path.set(path)

    def start_search(self):
        if self.is_searching:
            messagebox.showinfo("提示", "正在搜索中,请先停止当前搜索")
            return

        path = self.search_path.get()
        keyword = self.keyword.get()
        if not path:
            messagebox.showerror("错误", "请选择搜索路径")
            return
        if not keyword:
            messagebox.showerror("错误", "请输入搜索关键词")
            return
        if not os.path.exists(path):
            messagebox.showerror("错误", "所选路径不存在")
            return

        # Make sure at least one file type is selected
        self.update_included_extensions()
        if not self.included_extensions:
            messagebox.showerror("错误", "请至少选择一种文件类型")
            return

        # Reset the search state
        self.is_searching = True
        self.stop_event.clear()  # reset the stop event
        self.result_display.delete("1.0", tk.END)
        self.total_files = 0
        self.processed_files = 0
        self.matched_files = 0
        self.update_stats()

        # Create the queues
        self.progress_queue = self.manager.Queue()
        self.result_queue = self.manager.Queue()

        # Start the background thread that drives the search
        self.search_process = threading.Thread(
            target=self.perform_search,
            args=(path, keyword, self.search_type.get())
        )
        self.search_process.daemon = True
        self.search_process.start()

        # Start polling the progress and result queues
        self.root.after(100, self.process_queue_updates)

    def stop_search(self):
        if self.is_searching and self.search_process:
            self.stop_event.set()
            self.status_label.config(text="正在停止搜索...")

    def perform_search(self, root_path, keyword, search_type):
        try:
            # Stage 1: walk the tree and keep only files of the selected types
            self.progress_queue.put(("stage", "扫描文件并过滤(只保留选中类型)", 0))

            all_files = []
            for dirpath, _, filenames in os.walk(root_path):
                if self.stop_event.is_set():
                    self.progress_queue.put(("done", "搜索已停止"))
                    return
                for filename in filenames:
                    file_path = os.path.join(dirpath, filename)
                    # Keep the file only if its extension is selected
                    ext = os.path.splitext(filename)[1].lower()
                    if ext in self.included_extensions:
                        all_files.append(file_path)
                # The total is unknown while walking, so this is only a rough indicator
                total_scanned = len(all_files)
                progress = min(100, int((total_scanned / (total_scanned + 1)) * 100))
                self.progress_queue.put(("stage", "扫描文件并过滤(只保留选中类型)", progress))

            self.total_files = len(all_files)
            self.progress_queue.put(("total_files", self.total_files))
            self.progress_queue.put(("stage", "扫描文件并过滤完成", 100))

            if self.total_files == 0:
                self.progress_queue.put(("done", "未找到符合条件的文件类型"))
                return

            # Stage 2: search the files
            self.progress_queue.put(
                ("stage", f"正在{('搜索文件名' if search_type == 'name' else '搜索文件内容')}", 0))

            # Search with a process pool
            num_processes = min(mp.cpu_count(), self.total_files)
            chunk_size = max(1, self.total_files // num_processes)

            with Pool(processes=num_processes) as pool:
                # Bind the arguments shared by every chunk
                if search_type == 'name':
                    search_func = partial(search_filename, keyword=keyword,
                                          stop_event=self.stop_event,
                                          progress_queue=self.progress_queue)
                else:
                    search_func = partial(search_file_content, keyword=keyword,
                                          stop_event=self.stop_event,
                                          progress_queue=self.progress_queue)

                # Submit the file list chunk by chunk
                results = []
                for i in range(0, self.total_files, chunk_size):
                    chunk = all_files[i:i + chunk_size]
                    results.append(pool.apply_async(search_func, args=(chunk,)))

                # Collect the results
                for result in results:
                    matched_in_chunk = result.get()
                    for file_path, matches in matched_in_chunk:
                        self.result_queue.put((file_path, matches))
                    if self.stop_event.is_set():
                        pool.terminate()
                        self.progress_queue.put(("done", "搜索已停止"))
                        return

            self.progress_queue.put(("done", "搜索完成"))
        except Exception as e:
            self.progress_queue.put(("error", str(e)))

    def process_queue_updates(self):
        if not self.is_searching:
            return

        # Handle progress updates
        try:
            while not self.progress_queue.empty():
                item = self.progress_queue.get_nowait()
                if item[0] == "stage":
                    stage_name, progress = item[1], item[2]
                    self.stage_label.config(text=stage_name)
                    self.stage_progress["value"] = progress
                elif item[0] == "progress":
                    # Each worker reports the number of files handled since its last
                    # report, so the counts from all workers are accumulated here
                    self.processed_files += item[1]
                    overall_progress = ((self.processed_files / self.total_files) * 100
                                        if self.total_files > 0 else 0)
                    self.overall_progress["value"] = overall_progress
                    self.status_label.config(text=f"正在处理: {item[2]}")
                    self.update_stats()
                elif item[0] == "total_files":
                    self.total_files = item[1]
                    self.update_stats()
                elif item[0] == "done":
                    self.status_label.config(text=item[1])
                    self.is_searching = False
                elif item[0] == "error":
                    messagebox.showerror("错误", f"搜索过程中发生错误: {item[1]}")
                    self.is_searching = False
        except queue.Empty:
            pass

        # Handle result updates
        try:
            while not self.result_queue.empty():
                item = self.result_queue.get_nowait()
                file_path, matches = item
                self.matched_files += 1
                self.display_result(file_path, matches)
                self.update_stats()
        except queue.Empty:
            pass

        # Keep polling, or finish
        if self.is_searching:
            self.root.after(100, self.process_queue_updates)
        else:
            self.overall_progress["value"] = 100
            self.stage_progress["value"] = 100

    def display_result(self, file_path, matches):
        # Separator line
        self.result_display.insert(tk.END, "----------------------------\n", "separator")
        # File path (bold blue)
        self.result_display.insert(tk.END, f"文件路径: {file_path}\n", "filename")

        # For content searches, show line numbers and matching snippets
        if matches and self.search_type.get() == "content":
            self.result_display.insert(tk.END, " 匹配内容:\n")
            for line_num, line_content in matches[:5]:  # show only the first 5 matches
                # Line number (green)
                self.result_display.insert(tk.END, f" 第{line_num}行: ", "linenum")
                # Text inserted at END lands before the widget's final newline, so
                # "end-1c" marks where the snippet starts and where it ends
                start_pos = self.result_display.index("end-1c")
                self.result_display.insert(tk.END, line_content + "\n")
                end_pos = self.result_display.index("end-1c")

                # Highlight the keyword (yellow background)
                start = start_pos
                keyword = self.keyword.get()
                while True:
                    start = self.result_display.search(keyword, start, end_pos, nocase=True)
                    if not start:
                        break
                    line, col = map(int, start.split('.'))
                    end = f"{line}.{col + len(keyword)}"
                    self.result_display.tag_add("match", start, end)
                    start = end

        self.result_display.see(tk.END)

    def update_stats(self):
        self.stats_label.config(
            text=f"文件总数: {self.total_files} | 已处理: {self.processed_files} | 匹配: {self.matched_files}")


# Worker function: search file names
def search_filename(files, keyword, stop_event, progress_queue):
    matched = []
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    reported = 0  # files already reported to the progress queue
    for i, file_path in enumerate(files):
        if stop_event.is_set():
            return []
        filename = os.path.basename(file_path)
        if pattern.search(filename):
            matched.append((file_path, []))
        # Report progress every 10 files and at the end of the chunk
        if i % 10 == 0 or i == len(files) - 1:
            progress_queue.put(("progress", i + 1 - reported, file_path))
            reported = i + 1
    return matched


# Worker function: search file contents
def search_file_content(files, keyword, stop_event, progress_queue):
    matched = []
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    reported = 0  # files already reported to the progress queue
    for i, file_path in enumerate(files):
        if stop_event.is_set():
            return []
        # Guess the file type and skip media files, which are binary
        mime_type, _ = mimetypes.guess_type(file_path)
        is_media = bool(mime_type) and mime_type.startswith(('image/', 'audio/', 'video/'))
        if not is_media:
            try:
                # Try to read the file as text
                with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
                    matches_in_file = []
                    for line_num, line in enumerate(f, 1):
                        match = pattern.search(line)
                        if match:
                            # Cut out the context around the match; using the match
                            # offsets keeps this correct when the letter case differs
                            start = max(0, match.start() - 30)
                            end = min(len(line), match.end() + 30)
                            snippet = line[start:end].replace('\n', ' ').replace('\r', '')
                            matches_in_file.append((line_num, snippet))
                            # Record at most 10 matches per file
                            if len(matches_in_file) >= 10:
                                break
                    if matches_in_file:
                        matched.append((file_path, matches_in_file))
            except (IOError, UnicodeDecodeError):
                # Skip files that cannot be read
                pass
        # Report progress every 10 files and at the end of the chunk
        if i % 10 == 0 or i == len(files) - 1:
            progress_queue.put(("progress", i + 1 - reported, file_path))
            reported = i + 1
    return matched


if __name__ == "__main__":
    # Multiprocessing on Windows requires the main-module guard
    if os.name == 'nt':
        mp.set_start_method('spawn')
    root = tk.Tk()
    app = FileSearchApp(root)
    root.mainloop()
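The context snippet shown for each content match can be computed from the regular-expression match itself, which keeps the offsets correct even when the keyword's letter case differs from the text. A minimal sketch, where `make_snippet` and its `context` parameter are hypothetical names chosen for this example:

```python
import re


def make_snippet(line, keyword, context=30):
    """Return the text around the first case-insensitive match, or None."""
    match = re.search(re.escape(keyword), line, re.IGNORECASE)
    if match is None:
        return None
    # Keep up to `context` characters on each side of the match.
    start = max(0, match.start() - context)
    end = min(len(line), match.end() + context)
    # Flatten line breaks so the snippet fits on one display line.
    return line[start:end].replace("\n", " ").replace("\r", "")
```

For a line with 40 characters on each side of `Project Plan`, calling `make_snippet(line, "project plan")` keeps 30 characters of context on each side of the matched text.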
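Demo
The interface has a clear layout with three main areas:
- Left: file type filter panel. It contains six preset categories (audio, video, image, document, archive, and program files), each with checkboxes for its extensions. Tick the types to search, or click the "添加自定义扩展名" (add custom extension) button to add other formats such as .log or .json.
- Upper right: search controls.
  - Path selection: type the search root directory or pick it with the "浏览..." (browse) button.
  - Keyword input: any string is accepted, with two search modes, "按文件名" (by file name) and "按文件内容" (by file content).
  - Action buttons: "开始搜索" (start search) launches a task, and "停止搜索" (stop search) ends a running search.
  - Progress display: an overall progress bar (share of files processed) and a stage progress bar (current step) show the status as the search runs.
- Lower right: results area. Matches are shown in a structured form:
  - File paths are shown in bold blue so that different files are easy to tell apart.
  - For content searches, the matching line number is shown in green and the keyword is highlighted in yellow within its context snippet.
  - The statistics line at the bottom updates the total, processed, and matched file counts.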
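As a rough illustration of the file type filter in the left panel, the sketch below collects the files under a directory whose extension is in a selected set. The names `collect_files` and `demo`, and the sample file names, are assumptions made for this example and are not part of the tool.

```python
import os
import tempfile


def collect_files(root_path, extensions):
    """Return sorted paths under root_path whose extension is in extensions."""
    selected = {ext.lower() for ext in extensions}
    found = []
    for dirpath, _dirnames, filenames in os.walk(root_path):
        for filename in filenames:
            # Compare lowercased extensions so ".DOCX" and ".docx" are the same type.
            ext = os.path.splitext(filename)[1].lower()
            if ext in selected:
                found.append(os.path.join(dirpath, filename))
    return sorted(found)


def demo():
    """Build a throwaway directory and filter it for two document extensions."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("plan.DOCX", "notes.txt", "song.mp3"):
            with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
                f.write("sample")
        return [os.path.basename(p) for p in collect_files(tmp, {".docx", ".txt"})]
```

Keeping the selected extensions in a set makes the per-file check a constant-time lookup, which matters when a directory tree contains many files.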

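Example workflow:
- Tick the "文档文件" (document files) category on the left, which selects .doc, .pdf, .txt, and the other extensions in that category.
- Choose the search path, for example D:\工作文档.
- Enter the keyword "项目计划" and select "按文件内容" (by file content).
- Click "开始搜索" (start search). The tool first scans for files of the selected types, then searches their contents in parallel.
- The results area lists the paths of the documents that contain "项目计划", together with the matching lines.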
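The content search step in this workflow can be tried without the GUI. The sketch below searches a single text file for a keyword and returns the matching line numbers. `search_content`, `demo`, and the sample file contents are hypothetical and only illustrate the idea.

```python
import os
import re
import tempfile


def search_content(path, keyword, max_matches=10):
    """Return (line_number, line) pairs for lines that contain keyword."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    hits = []
    try:
        with open(path, "r", encoding="utf-8", errors="ignore") as f:
            for line_num, line in enumerate(f, 1):
                if pattern.search(line):
                    hits.append((line_num, line.rstrip("\r\n")))
                    if len(hits) >= max_matches:
                        break  # cap the number of matches kept per file
    except OSError:
        pass  # unreadable files are skipped
    return hits


def demo():
    """Write a small sample file and search it for the example keyword."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "report.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("标题\n本周项目计划如下\n其他内容\n项目计划评审\n")
        return search_content(path, "项目计划")
```

Here `demo()` returns the two matching lines of the sample file together with their line numbers, 2 and 4.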
