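Working with text
For a block-level item, a new line is added each time the text extends beyond its right boundary. For a paragraph the boundary is usually the page margin, but it can also be a column boundary if the page is laid out in columns, or a cell boundary if the paragraph occurs inside a table cell.
The attributes of a block-level item specify its placement on the page, such as indentation and the space before and after a paragraph. The attributes of an inline item generally specify the font in which the content appears, such as typeface, font size, bold, and italic.
Paragraph properties
A paragraph has a variety of properties that specify its placement within its container (usually a page) and the way its content is divided into separate lines.
Paragraph formatting properties are accessed through the ParagraphFormat object provided by the paragraph's paragraph_format property.
Horizontal alignment (justification)
The horizontal alignment of a paragraph can be set to left, centered, right, or fully justified (aligned on both the left and right sides) using values from the WD_PARAGRAPH_ALIGNMENT enumeration, which the example below imports under its alias WD_ALIGN_PARAGRAPH: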
from docx import Document
from docx.enum.text import WD_ALIGN_PARAGRAPH
document = Document()
paragraph = document.add_paragraph()
paragraph_format = paragraph.paragraph_format
paragraph_format.alignment
None  # alignment is inherited from the style hierarchy
paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER
paragraph_format.alignment
CENTER (1)
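Indentation
Indentation is the horizontal space between a paragraph and the edge of its container, usually the page margin. A paragraph can be indented separately on the left and the right side. The first line can also have a different indentation than the rest of the paragraph.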
from docx.shared import Inches
paragraph = document.add_paragraph()
paragraph_format = paragraph.paragraph_format
paragraph_format.left_indent
None  # inherited from the style hierarchy
paragraph_format.left_indent = Inches(0.5)
paragraph_format.left_indent
457200
paragraph_format.left_indent.inches
0.5
Right-side indent works in a similar way:
from docx.shared import Pt
paragraph_format.right_indent
None
paragraph_format.right_indent = Pt(24)
paragraph_format.right_indent
304800
paragraph_format.right_indent.pt
24.0
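The bare integers shown above (457200 and 304800) are lengths in English Metric Units (EMU), with 914400 EMU to the inch and 12700 EMU to the point. The sketch below reproduces that arithmetic in plain Python; `inches_to_emu` and `pt_to_emu` are hypothetical helper names used only for illustration and are not library functions.

```python
# Conversion factors for English Metric Units (EMU).
EMU_PER_INCH = 914400
EMU_PER_PT = 12700


def inches_to_emu(inches):
    """Convert a length in inches to whole EMU (hypothetical helper)."""
    return int(inches * EMU_PER_INCH)


def pt_to_emu(points):
    """Convert a length in points to whole EMU (hypothetical helper)."""
    return int(points * EMU_PER_PT)


left_indent_emu = inches_to_emu(0.5)   # same magnitude as Inches(0.5)
right_indent_emu = pt_to_emu(24)       # same magnitude as Pt(24)
```

In practice the Inches and Pt classes do this conversion, and the .inches and .pt properties convert back.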
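First-line indent is specified with the first_line_indent property and is interpreted relative to the left indent. A negative value indicates a hanging indent: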
paragraph_format.first_line_indent
None
paragraph_format.first_line_indent = Inches(-0.25)
paragraph_format.first_line_indent
-228600
paragraph_format.first_line_indent.inches
-0.25
Tab stops
Tab stops for a paragraph or style are contained in a TabStops object, accessed through the tab_stops property on ParagraphFormat:
tab_stops = paragraph_format.tab_stops
tab_stops
<docx.text.tabstops.TabStops object at 0x106b802d8>
A new tab stop is added with add_tab_stop():
tab_stop = tab_stops.add_tab_stop(Inches(1.5))
tab_stop.position
1371600
tab_stop.position.inches
1.5
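Alignment defaults to left, but another alignment can be specified by providing a member of the WD_TAB_ALIGNMENT enumeration. The leader character defaults to spaces, but another can be specified by providing a member of the WD_TAB_LEADER enumeration: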
from docx.enum.text import WD_TAB_ALIGNMENT, WD_TAB_LEADER
tab_stop = tab_stops.add_tab_stop(Inches(1.5), WD_TAB_ALIGNMENT.RIGHT, WD_TAB_LEADER.DOTS)
print(tab_stop.alignment)
RIGHT (2)
print(tab_stop.leader)
DOTS (1)
Existing tab stops are accessed by index:
tab_stops[0]
<docx.text.tabstops.TabStop object at 0x1105427e8>
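Paragraph spacing
The space_before and space_after properties control the spacing before and after a paragraph, respectively.
These values are usually specified in Pt: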
paragraph_format.space_before, paragraph_format.space_after
(None, None) # inherited by default
paragraph_format.space_before = Pt(18)
paragraph_format.space_before.pt
18.0
paragraph_format.space_after = Pt(12)
paragraph_format.space_after.pt
12.0
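Line spacing
Line spacing is controlled by the interaction of the line_spacing and line_spacing_rule properties. line_spacing is either a Length value, a (small-ish) float, or None.
A Length value indicates an absolute distance. A float indicates a number of line heights. None indicates that line spacing is inherited. line_spacing_rule is a member of the WD_LINE_SPACING enumeration or None: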
from docx.shared import Length
paragraph_format.line_spacing
None
paragraph_format.line_spacing_rule
None
paragraph_format.line_spacing = Pt(18)
isinstance(paragraph_format.line_spacing, Length)
True
paragraph_format.line_spacing.pt
18.0
paragraph_format.line_spacing_rule
EXACTLY (4)
paragraph_format.line_spacing = 1.75
paragraph_format.line_spacing
1.75
paragraph_format.line_spacing_rule
MULTIPLE (5)
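Pagination properties
Four paragraph properties (keep_together, keep_with_next, page_break_before, and widow_control) control how a paragraph behaves near page boundaries.
1. keep_together keeps the entire paragraph on the same page; if the paragraph would otherwise be split across two pages, a page break is issued before it.
2. keep_with_next keeps the paragraph on the same page as the following paragraph. This can be used, for example, to keep a section heading on the same page as the first paragraph of the section.
3. page_break_before places the paragraph at the top of a new page. This is useful for chapter headings.
4. widow_control breaks a page to avoid leaving the first or last line of the paragraph on a separate page from the rest of the paragraph.
All four properties are tri-state, taking the value True, False, or None. None indicates that the value is inherited, True means "on", and False means "off":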
paragraph_format.keep_together
None  # inherited by default
paragraph_format.keep_with_next = True
paragraph_format.keep_with_next
True
paragraph_format.page_break_before = False
paragraph_format.page_break_before
False
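Applying character formatting
Character formatting is applied at the run level. Examples include typeface and size, bold, italic, and underline.
A Run object has a read-only font property that provides access to a Font object, which in turn exposes properties for getting and setting the character formatting of that run.
The font is accessed like this: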
from docx import Document
document = Document()
run = document.add_paragraph().add_run()
font = run.font
Typeface and size are set like this:
from docx.shared import Pt
font.name = "Calibri"
font.size = Pt(12)
Bold and italic are tri-state properties, as are all-caps, strikethrough, superscript, and many others.
font.bold, font.italic
(None, None)
font.italic = True
font.italic
True
font.italic = False
font.italic
False
font.italic = None
font.italic
None
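Underline is a bit of a special case. It is a hybrid of a tri-state property and an enumerated-value property. True means single underline. False means no underline, but more commonly None is used when no underline is wanted.
Other forms of underline, such as double or dashed, are specified with a member of the WD_UNDERLINE enumeration: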
from docx.enum.text import WD_UNDERLINE
font.underline
None
font.underline = True
# or
font.underline = WD_UNDERLINE.DOT_DASH
Font color
Each Font object has a ColorFormat object that provides access to its color.
To apply a specific RGB color:
from docx.shared import RGBColor
font.color.rgb = RGBColor(0x42, 0x24, 0xE9)
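A font can also be set to a theme color by assigning a member of the MSO_THEME_COLOR_INDEX enumeration (imported below under its alias MSO_THEME_COLOR):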
from docx.enum.dml import MSO_THEME_COLOR
font.color.theme_color = MSO_THEME_COLOR.ACCENT_1
Assigning None to the rgb or theme_color property of ColorFormat restores the default (inherited) color:
font.color.rgb = None
Determining the color of a font begins with determining its color type. For a font that has an RGB color assigned:
font.color.type
RGB (1)
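The value of the type property is either a member of the MSO_COLOR_TYPE enumeration or None. MSO_COLOR_TYPE.RGB indicates an RGB color. MSO_COLOR_TYPE.THEME indicates a theme color. MSO_COLOR_TYPE.AUTO indicates that the value is determined automatically by the application, usually set to black. (This value is relatively uncommon.)
None indicates that no color is applied and the color is inherited; this is the most common case.
When the color type is MSO_COLOR_TYPE.RGB, the rgb property is an RGBColor value giving the RGB color: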
font.color.rgb
RGBColor(0x42, 0x24, 0xe9)
When the color type is MSO_COLOR_TYPE.THEME, the theme_color property is a member of MSO_THEME_COLOR_INDEX giving the theme color:
font.color.theme_color
ACCENT_1 (5)