1. Transpose
Concept: swap a matrix's rows and columns.
Applications: converting feature-map dimensions in convolutional networks; adjusting dimensions before matrix multiplication; rearranging features during data preprocessing.
Example: A = [[1, 2, 3], [4, 5, 6]]  →  A^T = [[1, 4], [2, 5], [3, 6]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

matrix = np.array([[1, 2, 3], [4, 5, 6]])
print("Original matrix:")
print(matrix)
print("Shape:", matrix.shape)

print("\n1. NumPy transpose:")
np_transpose = matrix.T
print(np_transpose)
print("Shape:", np_transpose.shape)
np_transpose2 = np.transpose(matrix)  # functional equivalent of .T
print(np_transpose2)

print("\n2. TensorFlow transpose:")
tf_matrix = tf.constant(matrix)
tf_transpose = tf.transpose(tf_matrix)
print(tf_transpose.numpy())

print("\n3. PyTorch transpose:")
torch_matrix = torch.tensor(matrix)
torch_transpose = torch.transpose(torch_matrix, 0, 1)  # swap dims 0 and 1
print(torch_transpose)
torch_transpose2 = torch_matrix.T
print(torch_transpose2)

print("\n4. pandas transpose:")
df = pd.DataFrame(matrix)
pd_transpose = df.T
print(pd_transpose)

print("\n5. SciPy transpose (sparse matrix):")
sparse_matrix = sp.csr_matrix(matrix)
scipy_transpose = sparse_matrix.transpose()
print(scipy_transpose.toarray())
```
2. Matrix Multiplication
Concept: multiply two matrices; the number of columns of the first matrix must equal the number of rows of the second.
Applications: attention computation and fully connected layers in Transformer architectures; multiplying a linear layer's weights with its input; similarity between embedding vectors and query vectors.
Example: A = [[1, 2], [3, 4]],  B = [[5, 6], [7, 8]]  →  A×B = [[19, 22], [43, 50]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print("Matrix A:")
print(A)
print("Matrix B:")
print(B)

print("\n1. NumPy matrix multiplication:")
np_matmul = np.matmul(A, B)
print(np_matmul)
np_matmul2 = A @ B  # @ operator
print(np_matmul2)
np_dot = np.dot(A, B)  # behaves like matmul for 2-D arrays
print(np_dot)

print("\n2. TensorFlow matrix multiplication:")
tf_A = tf.constant(A)
tf_B = tf.constant(B)
tf_matmul = tf.matmul(tf_A, tf_B)
print(tf_matmul.numpy())

print("\n3. PyTorch matrix multiplication:")
torch_A = torch.tensor(A)
torch_B = torch.tensor(B)
torch_matmul = torch.matmul(torch_A, torch_B)
print(torch_matmul)
torch_matmul2 = torch_A @ torch_B
print(torch_matmul2)
torch_mm = torch.mm(torch_A, torch_B)  # strictly 2-D matrix multiply
print(torch_mm)

print("\n4. pandas matrix multiplication:")
df_A = pd.DataFrame(A)
df_B = pd.DataFrame(B)
pd_matmul = df_A.values @ df_B.values
print(pd_matmul)

print("\n5. SciPy matrix multiplication (sparse matrix):")
sparse_A = sp.csr_matrix(A)
sparse_B = sp.csr_matrix(B)
scipy_matmul = sparse_A @ sparse_B
print(scipy_matmul.toarray())
```
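As a small illustration of the Transformer use case mentioned above, the scaled dot-product attention score is just a matrix product between queries and transposed keys. This is a minimal sketch; the shapes and names (`Q`, `K`, `d_k`) are made up for illustration, not taken from any particular model:

```python
import numpy as np

rng = np.random.default_rng(0)
d_k = 4                              # key/query dimension (illustrative)
Q = rng.standard_normal((3, d_k))    # 3 query vectors
K = rng.standard_normal((5, d_k))    # 5 key vectors

# scaled dot-product attention scores: (3, d_k) @ (d_k, 5) -> (3, 5)
scores = Q @ K.T / np.sqrt(d_k)
print(scores.shape)  # (3, 5)
```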
3. Element-wise Multiplication (Hadamard Product)
Concept: multiply the corresponding elements of two matrices with the same shape.
Applications: gating mechanisms (e.g., the gates in LSTM and GRU); feature selection and weighting; mask operations (e.g., attention masks).
Example: A = [[1, 2], [3, 4]],  B = [[5, 6], [7, 8]]  →  A⊙B = [[5, 12], [21, 32]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print("Matrix A:")
print(A)
print("Matrix B:")
print(B)

print("\n1. NumPy element-wise multiplication:")
np_element_wise = A * B
print(np_element_wise)
np_multiply = np.multiply(A, B)
print(np_multiply)

print("\n2. TensorFlow element-wise multiplication:")
tf_A = tf.constant(A)
tf_B = tf.constant(B)
tf_element_wise = tf_A * tf_B
print(tf_element_wise.numpy())
tf_multiply = tf.multiply(tf_A, tf_B)
print(tf_multiply.numpy())

print("\n3. PyTorch element-wise multiplication:")
torch_A = torch.tensor(A)
torch_B = torch.tensor(B)
torch_element_wise = torch_A * torch_B
print(torch_element_wise)
torch_mul = torch.mul(torch_A, torch_B)
print(torch_mul)

print("\n4. pandas element-wise multiplication:")
df_A = pd.DataFrame(A)
df_B = pd.DataFrame(B)
pd_element_wise = df_A * df_B
print(pd_element_wise)

print("\n5. SciPy element-wise multiplication (sparse matrix):")
sparse_A = sp.csr_matrix(A)
sparse_B = sp.csr_matrix(B)
scipy_element_wise = sparse_A.multiply(sparse_B)
print(scipy_element_wise.toarray())
```
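To connect this to the gating use case above, here is a minimal sketch of a sigmoid gate applied element-wise to a candidate state, in the spirit of LSTM/GRU gates. The tensors and sizes are invented for illustration:

```python
import torch

torch.manual_seed(0)
candidate = torch.randn(2, 4)              # candidate hidden state (batch=2, features=4)
gate = torch.sigmoid(torch.randn(2, 4))    # gate values in (0, 1)

# the gate decides, per element, how much of the candidate passes through
gated = gate * candidate
print(gated.shape)  # torch.Size([2, 4])
```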
4. Matrix Inversion
Concept: for a square matrix A, find A⁻¹ such that A × A⁻¹ = A⁻¹ × A = I (the identity matrix).
Applications: solving systems of linear equations; least-squares solutions in linear regression (normal equations); working with covariance matrices (e.g., in Gaussian models).
Example: A = [[4, 7], [2, 6]]  →  A⁻¹ = [[0.6, -0.7], [-0.2, 0.4]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.linalg as spla

A = np.array([[4, 7], [2, 6]])
print("Original matrix:")
print(A)

print("\n1. NumPy matrix inversion:")
np_inv = np.linalg.inv(A)
print(np_inv)
print("Check A @ A^-1 ≈ I:")
print(A @ np_inv)

print("\n2. TensorFlow matrix inversion:")
tf_A = tf.constant(A, dtype=tf.float32)
tf_inv = tf.linalg.inv(tf_A)
print(tf_inv.numpy())

print("\n3. PyTorch matrix inversion:")
torch_A = torch.tensor(A, dtype=torch.float32)
torch_inv = torch.inverse(torch_A)
print(torch_inv)

print("\n4. pandas matrix inversion:")
df_A = pd.DataFrame(A)
pd_inv = pd.DataFrame(np.linalg.inv(df_A.values))  # pandas has no inverse; fall back to NumPy
print(pd_inv)

print("\n5. SciPy matrix inversion:")
scipy_inv = spla.inv(A)
print(scipy_inv)
```
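In practice, when the goal is to solve a linear system Ax = b rather than to obtain A⁻¹ itself, it is usually faster and more numerically stable to call a solver directly instead of forming the inverse. A minimal comparison, with `b` chosen arbitrarily for illustration:

```python
import numpy as np

A = np.array([[4.0, 7.0], [2.0, 6.0]])
b = np.array([1.0, 2.0])               # illustrative right-hand side

x_via_inv = np.linalg.inv(A) @ b       # works, but computes the full inverse
x_via_solve = np.linalg.solve(A, b)    # preferred: solves Ax = b directly

print(np.allclose(x_via_inv, x_via_solve))  # True
```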
5. Matrix Decomposition - SVD (Singular Value Decomposition)
Concept: factor a matrix as A = U × Σ × V^T, where U and V are orthogonal matrices and Σ is a diagonal matrix of singular values.
Applications: dimensionality reduction (e.g., implementing PCA); matrix factorization in recommender systems; image compression and noise filtering; latent semantic analysis (LSA).
Example: A = [[1, 2], [3, 4], [5, 6]] decomposes into
U ≈ [[-0.2298, 0.8835, 0.4082], [-0.5247, 0.2408, -0.8165], [-0.8196, -0.4019, 0.4082]]
Σ ≈ [[9.5255, 0], [0, 0.5144], [0, 0]]
V^T ≈ [[-0.6196, -0.7849], [-0.7849, 0.6196]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.linalg as spla

A = np.array([[1, 2], [3, 4], [5, 6]])
print("Original matrix:")
print(A)

print("\n1. NumPy SVD:")
U, s, Vh = np.linalg.svd(A, full_matrices=True)
print("U matrix:")
print(U)
print("Singular values:")
print(s)
print("V^T matrix:")
print(Vh)
# rebuild the (3, 2) Sigma matrix from the singular values, then reconstruct A
S = np.zeros((A.shape[0], A.shape[1]))
S[:len(s), :len(s)] = np.diag(s)
A_reconstructed = U @ S @ Vh
print("Reconstructed matrix:")
print(A_reconstructed)

print("\n2. TensorFlow SVD:")
tf_A = tf.constant(A, dtype=tf.float32)
s_tf, u_tf, v_tf = tf.linalg.svd(tf_A)  # note the return order: s, u, v
print("Singular values:")
print(s_tf.numpy())
print("U matrix:")
print(u_tf.numpy())
print("V matrix (note: this is V, not V^T):")
print(v_tf.numpy())

print("\n3. PyTorch SVD:")
torch_A = torch.tensor(A, dtype=torch.float32)
U_torch, s_torch, V_torch = torch.svd(torch_A)  # newer torch.linalg.svd returns V^T instead
print("U matrix:")
print(U_torch)
print("Singular values:")
print(s_torch)
print("V matrix (note: this is V, not V^T):")
print(V_torch)

print("\n4. pandas SVD:")
df_A = pd.DataFrame(A)
U_pd, s_pd, Vh_pd = np.linalg.svd(df_A.values)
print("pandas data decomposed via NumPy's SVD")

print("\n5. SciPy SVD:")
U_scipy, s_scipy, Vh_scipy = spla.svd(A)
print("U matrix:")
print(U_scipy)
print("Singular values:")
print(s_scipy)
print("V^T matrix:")
print(Vh_scipy)
```
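The dimensionality-reduction use case boils down to keeping only the largest singular values. Here is a minimal rank-1 approximation of the same matrix; the variable names are just local choices:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
U, s, Vh = np.linalg.svd(A, full_matrices=False)

k = 1  # keep only the largest singular value
A_rank1 = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

print(A_rank1)                        # best rank-1 approximation of A (in the least-squares sense)
print(np.linalg.norm(A - A_rank1))    # small reconstruction error, dominated by s[1]
```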
6. Matrix Reshape
Concept: change the shape of a matrix while keeping the total number of elements unchanged.
Applications: dimension changes between neural-network layers (e.g., flattening); adjusting the dimensions of batched data; feature-map conversion (e.g., between CNNs and RNNs).
Example: A = [[1, 2, 3], [4, 5, 6]]  →  reshape(3, 2) = [[1, 2], [3, 4], [5, 6]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

A = np.array([[1, 2, 3], [4, 5, 6]])
print("Original matrix:")
print(A)
print("Shape:", A.shape)

print("\n1. NumPy reshape:")
np_reshaped = A.reshape(3, 2)
print(np_reshaped)
print("New shape:", np_reshaped.shape)
np_reshaped_auto = A.reshape(-1, 2)  # -1 lets NumPy infer the row count
print("Row count inferred automatically:")
print(np_reshaped_auto)

print("\n2. TensorFlow reshape:")
tf_A = tf.constant(A)
tf_reshaped = tf.reshape(tf_A, [3, 2])
print(tf_reshaped.numpy())

print("\n3. PyTorch reshape:")
torch_A = torch.tensor(A)
torch_reshaped = torch_A.reshape(3, 2)
print(torch_reshaped)
torch_viewed = torch_A.view(3, 2)  # view requires a compatible (contiguous) memory layout
print("Using view:")
print(torch_viewed)

print("\n4. pandas reshape:")
df_A = pd.DataFrame(A)
pd_reshaped = df_A.values.reshape(3, 2)
print(pd_reshaped)
pd_reshaped_df = pd.DataFrame(pd_reshaped)
print(pd_reshaped_df)

print("\n5. SciPy reshape (sparse matrix):")
sparse_A = sp.csr_matrix(A)
scipy_reshaped = sparse_A.reshape(3, 2)
print(scipy_reshaped.toarray())
```
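The flattening use case mentioned above usually looks like this: collapse every dimension except the batch dimension before a fully connected layer. The shapes below are illustrative only:

```python
import torch

feature_maps = torch.randn(8, 16, 4, 4)  # (batch, channels, height, width), illustrative sizes

# keep the batch dimension, flatten the rest: (8, 16*4*4) = (8, 256)
flattened = feature_maps.reshape(feature_maps.size(0), -1)
print(flattened.shape)  # torch.Size([8, 256])
```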
7. Matrix Normalization
Concept: rescale a matrix so that it satisfies a particular norm constraint.
Applications: batch normalization; weight regularization to reduce overfitting; feature scaling to speed up training; gradient clipping to prevent exploding gradients.
Example: A = [[1, 2], [3, 4]]  →  L2 row normalization = [[0.4472, 0.8944], [0.6, 0.8]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp
import scipy.sparse.linalg  # makes sp.linalg available for the sparse norm below
from sklearn.preprocessing import normalize

A = np.array([[1, 2], [3, 4]])
print("Original matrix:")
print(A)

print("\n1. NumPy row normalization:")
row_norms = np.sqrt(np.sum(A**2, axis=1, keepdims=True))
np_normalized = A / row_norms
print(np_normalized)
np_normalized_sklearn = normalize(A, norm='l2', axis=1)
print("Using sklearn:")
print(np_normalized_sklearn)

print("\n2. TensorFlow normalization:")
tf_A = tf.constant(A, dtype=tf.float32)
tf_normalized = tf.nn.l2_normalize(tf_A, axis=1)
print(tf_normalized.numpy())
tf_batch_norm = tf.keras.layers.BatchNormalization()(tf.reshape(tf_A, [1, 2, 2, 1]))
print("Batch Normalization (reshaped to a 4-D input):")
print(tf.reshape(tf_batch_norm, [2, 2]).numpy())

print("\n3. PyTorch normalization:")
torch_A = torch.tensor(A, dtype=torch.float32)
torch_normalized = torch.nn.functional.normalize(torch_A, p=2, dim=1)
print(torch_normalized)
batch_norm = torch.nn.BatchNorm1d(2)
torch_batch_norm = batch_norm(torch_A)
print("Batch Normalization:")
print(torch_batch_norm)

print("\n4. pandas normalization:")
df_A = pd.DataFrame(A)
pd_normalized = df_A.apply(lambda x: x / np.sqrt(np.sum(x**2)), axis=1)
print(pd_normalized)

print("\n5. SciPy normalization (sparse matrix):")
sparse_A = sp.csr_matrix(A)
row_norms = sp.linalg.norm(sparse_A, axis=1)  # scipy.sparse.linalg.norm
row_norms_inv = 1.0 / row_norms
diag_inv = sp.spdiags(row_norms_inv, 0, sparse_A.shape[0], sparse_A.shape[0])
scipy_normalized = diag_inv @ sparse_A
print(scipy_normalized.toarray())
```
8. Matrix Concatenation
Concept: join multiple matrices along a specified axis.
Applications: feature fusion (multimodal learning); merging batched data; concatenating sequence data (e.g., in RNNs); combining features across levels (e.g., in skip connections).
Example: A = [[1, 2], [3, 4]],  B = [[5, 6], [7, 8]]
Horizontal concat = [[1, 2, 5, 6], [3, 4, 7, 8]];  vertical concat = [[1, 2], [3, 4], [5, 6], [7, 8]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print("Matrix A:")
print(A)
print("Matrix B:")
print(B)

print("\n1. NumPy concatenation:")
np_hconcat = np.concatenate([A, B], axis=1)
print("Horizontal concat (axis=1):")
print(np_hconcat)
np_vconcat = np.concatenate([A, B], axis=0)
print("Vertical concat (axis=0):")
print(np_vconcat)
np_hstack = np.hstack([A, B])
np_vstack = np.vstack([A, B])
print("Using hstack:")
print(np_hstack)
print("Using vstack:")
print(np_vstack)

print("\n2. TensorFlow concatenation:")
tf_A = tf.constant(A)
tf_B = tf.constant(B)
tf_hconcat = tf.concat([tf_A, tf_B], axis=1)
print("Horizontal concat (axis=1):")
print(tf_hconcat.numpy())
tf_vconcat = tf.concat([tf_A, tf_B], axis=0)
print("Vertical concat (axis=0):")
print(tf_vconcat.numpy())

print("\n3. PyTorch concatenation:")
torch_A = torch.tensor(A)
torch_B = torch.tensor(B)
torch_hconcat = torch.cat([torch_A, torch_B], dim=1)
print("Horizontal concat (dim=1):")
print(torch_hconcat)
torch_vconcat = torch.cat([torch_A, torch_B], dim=0)
print("Vertical concat (dim=0):")
print(torch_vconcat)
torch_hstack = torch.hstack([torch_A, torch_B])
torch_vstack = torch.vstack([torch_A, torch_B])
print("Using hstack:")
print(torch_hstack)
print("Using vstack:")
print(torch_vstack)

print("\n4. pandas concatenation:")
df_A = pd.DataFrame(A)
df_B = pd.DataFrame(B)
pd_hconcat = pd.concat([df_A, df_B], axis=1)
print("Horizontal concat (axis=1):")
print(pd_hconcat)
pd_vconcat = pd.concat([df_A, df_B], axis=0)
print("Vertical concat (axis=0):")
print(pd_vconcat)

print("\n5. SciPy concatenation (sparse matrix):")
sparse_A = sp.csr_matrix(A)
sparse_B = sp.csr_matrix(B)
scipy_hconcat = sp.hstack([sparse_A, sparse_B])
print("Horizontal concat:")
print(scipy_hconcat.toarray())
scipy_vconcat = sp.vstack([sparse_A, sparse_B])
print("Vertical concat:")
print(scipy_vconcat.toarray())
```
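As a sketch of the feature-fusion use case, two feature tensors that share a batch dimension can be concatenated along the feature axis before a downstream layer. The sizes and names here are arbitrary placeholders:

```python
import torch

image_features = torch.randn(8, 128)  # e.g. from a vision encoder (illustrative)
text_features = torch.randn(8, 64)    # e.g. from a text encoder (illustrative)

# fuse along the feature dimension: (8, 128 + 64) = (8, 192)
fused = torch.cat([image_features, text_features], dim=1)
print(fused.shape)  # torch.Size([8, 192])
```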
9. Matrix Sums and Reductions
Concept: sum or otherwise reduce a matrix along a specified axis.
Applications: pooling operations (average pooling, max pooling); normalizing attention weights; computing batch statistics; reductions inside loss functions.
Example: A = [[1, 2, 3], [4, 5, 6]]  →  row sums = [6, 15], column sums = [5, 7, 9], total sum = 21
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

A = np.array([[1, 2, 3], [4, 5, 6]])
print("Original matrix:")
print(A)

print("\n1. NumPy reductions:")
np_row_sum = np.sum(A, axis=1)
print("Row sums (axis=1):")
print(np_row_sum)
np_col_sum = np.sum(A, axis=0)
print("Column sums (axis=0):")
print(np_col_sum)
np_total_sum = np.sum(A)
print("Total sum:")
print(np_total_sum)
np_max = np.max(A, axis=1)
np_min = np.min(A, axis=0)
np_mean = np.mean(A)
print("Row-wise max:", np_max)
print("Column-wise min:", np_min)
print("Overall mean:", np_mean)

print("\n2. TensorFlow reductions:")
tf_A = tf.constant(A)
tf_row_sum = tf.reduce_sum(tf_A, axis=1)
print("Row sums (axis=1):")
print(tf_row_sum.numpy())
tf_col_sum = tf.reduce_sum(tf_A, axis=0)
print("Column sums (axis=0):")
print(tf_col_sum.numpy())
tf_total_sum = tf.reduce_sum(tf_A)
print("Total sum:")
print(tf_total_sum.numpy())
tf_max = tf.reduce_max(tf_A, axis=1)
tf_min = tf.reduce_min(tf_A, axis=0)
tf_mean = tf.reduce_mean(tf_A)
print("Row-wise max:", tf_max.numpy())
print("Column-wise min:", tf_min.numpy())
print("Overall mean:", tf_mean.numpy())

print("\n3. PyTorch reductions:")
torch_A = torch.tensor(A)
torch_row_sum = torch.sum(torch_A, dim=1)
print("Row sums (dim=1):")
print(torch_row_sum)
torch_col_sum = torch.sum(torch_A, dim=0)
print("Column sums (dim=0):")
print(torch_col_sum)
torch_total_sum = torch.sum(torch_A)
print("Total sum:")
print(torch_total_sum)
torch_max = torch.max(torch_A, dim=1).values
torch_min = torch.min(torch_A, dim=0).values
torch_mean = torch.mean(torch_A.float())  # torch.mean requires a floating-point tensor
print("Row-wise max:", torch_max)
print("Column-wise min:", torch_min)
print("Overall mean:", torch_mean)

print("\n4. pandas reductions:")
df_A = pd.DataFrame(A)
pd_row_sum = df_A.sum(axis=1)
print("Row sums (axis=1):")
print(pd_row_sum)
pd_col_sum = df_A.sum(axis=0)
print("Column sums (axis=0):")
print(pd_col_sum)
pd_total_sum = df_A.values.sum()
print("Total sum:")
print(pd_total_sum)
pd_max = df_A.max(axis=1)
pd_min = df_A.min(axis=0)
pd_mean = df_A.mean().mean()
print("Row-wise max:", pd_max.values)
print("Column-wise min:", pd_min.values)
print("Overall mean:", pd_mean)

print("\n5. SciPy reductions (sparse matrix):")
sparse_A = sp.csr_matrix(A)
scipy_row_sum = sparse_A.sum(axis=1)
print("Row sums (axis=1):")
print(scipy_row_sum)
scipy_col_sum = sparse_A.sum(axis=0)
print("Column sums (axis=0):")
print(scipy_col_sum)
scipy_total_sum = sparse_A.sum()
print("Total sum:")
print(scipy_total_sum)
```
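The attention-weight normalization mentioned above is a reduction in disguise: softmax divides exponentiated scores by their row sums. A minimal, numerically stabilized sketch with made-up scores:

```python
import numpy as np

scores = np.array([[1.0, 2.0, 3.0],
                   [0.5, 0.5, 0.5]])  # illustrative attention scores

# subtract the row-wise max for numerical stability, then normalize by the row sum
shifted = scores - scores.max(axis=1, keepdims=True)
weights = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

print(weights.sum(axis=1))  # each row sums to 1
```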
10. Broadcasting
Concept: automatically expand the smaller operand to match the shape of the larger one, so that arrays of different shapes can be combined in one operation.
Applications: weight sharing in batch processing; adding a bias term to a feature matrix; scaling batched data; applying masks in attention mechanisms.
Example: A = [[1, 2, 3], [4, 5, 6]],  b = [10, 20, 30]  →  A + b = [[11, 22, 33], [14, 25, 36]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

A = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([10, 20, 30])
print("Matrix A:")
print(A)
print("Vector b:")
print(b)

print("\n1. NumPy broadcasting:")
np_broadcast = A + b
print("A + b (broadcast):")
print(np_broadcast)
c = np.array([[100], [200]])
np_broadcast2 = A + c
print("A + c (column-vector broadcast):")
print(np_broadcast2)

print("\n2. TensorFlow broadcasting:")
tf_A = tf.constant(A)
tf_b = tf.constant(b)
tf_broadcast = tf_A + tf_b
print("A + b (broadcast):")
print(tf_broadcast.numpy())
tf_c = tf.constant([[100], [200]], dtype=tf_A.dtype)  # match A's dtype; TF does not promote types
tf_broadcast2 = tf_A + tf_c
print("A + c (column-vector broadcast):")
print(tf_broadcast2.numpy())

print("\n3. PyTorch broadcasting:")
torch_A = torch.tensor(A)
torch_b = torch.tensor(b)
torch_broadcast = torch_A + torch_b
print("A + b (broadcast):")
print(torch_broadcast)
torch_c = torch.tensor([[100], [200]])
torch_broadcast2 = torch_A + torch_c
print("A + c (column-vector broadcast):")
print(torch_broadcast2)
torch_b_expanded = torch.broadcast_to(torch_b, (2, 3))
print("Explicitly broadcast b:")
print(torch_b_expanded)

print("\n4. pandas broadcasting:")
df_A = pd.DataFrame(A)
s_b = pd.Series(b)
pd_broadcast = df_A + s_b.values
print("Broadcasting with the underlying NumPy values:")
print(pd_broadcast)
pd_broadcast2 = df_A.values + b
print("Operating directly on NumPy arrays:")
print(pd_broadcast2)

print("\n5. SciPy broadcasting (similar to NumPy):")
sparse_A = sp.csr_matrix(A)
scipy_broadcast = sp.csr_matrix(sparse_A.toarray() + b)
print("Sparse broadcast (via densification):")
print(scipy_broadcast.toarray())
```
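Another common broadcasting pattern from the feature-scaling use case is per-column standardization: a (1, n) row of means and a (1, n) row of standard deviations are broadcast against an (m, n) matrix. A small sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))  # illustrative feature matrix

mean = X.mean(axis=0, keepdims=True)  # shape (1, 3), broadcasts over the rows
std = X.std(axis=0, keepdims=True)    # shape (1, 3)
X_standardized = (X - mean) / std

print(X_standardized.mean(axis=0).round(6))  # ~0 per column
print(X_standardized.std(axis=0).round(6))   # ~1 per column
```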
11. Masking
Concept: use a Boolean matrix to selectively keep or modify specific elements of a matrix.
Applications: attention masks (the masking mechanism in Transformers); handling padded, variable-length sequences; conditional computation (selective activation); gradient masking (controlling backpropagation).
Example: A = [[1, 2, 3], [4, 5, 6]],  M = [[True, False, True], [False, True, False]]  →  masked result = [[1, 0, 3], [0, 5, 0]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

A = np.array([[1, 2, 3], [4, 5, 6]])
mask = np.array([[True, False, True], [False, True, False]])
print("Original matrix A:")
print(A)
print("Mask matrix:")
print(mask)

print("\n1. NumPy masking:")
np_masked = A * mask
print("Apply mask (A * mask):")
print(np_masked)
np_masked_specific = np.where(mask, A, -1)
print("Apply mask (replace with a specific value):")
print(np_masked_specific)
condition_mask = A > 3
np_conditional = np.where(condition_mask, A, 0)
print("Conditional mask (A > 3):")
print(np_conditional)

print("\n2. TensorFlow masking:")
tf_A = tf.constant(A)
tf_mask = tf.constant(mask)
tf_masked = tf_A * tf.cast(tf_mask, tf_A.dtype)  # cast the boolean mask to A's dtype
print("Apply mask (A * mask):")
print(tf_masked.numpy())
tf_masked_specific = tf.where(tf_mask, tf_A, -1)
print("Apply mask (replace with a specific value):")
print(tf_masked_specific.numpy())
tf_condition_mask = tf_A > 3
tf_conditional = tf.where(tf_condition_mask, tf_A, 0)
print("Conditional mask (A > 3):")
print(tf_conditional.numpy())

print("\n3. PyTorch masking:")
torch_A = torch.tensor(A)
torch_mask = torch.tensor(mask)
torch_masked = torch_A * torch_mask
print("Apply mask (A * mask):")
print(torch_masked)
torch_masked_specific = torch.where(torch_mask, torch_A, torch.tensor(-1))
print("Apply mask (replace with a specific value):")
print(torch_masked_specific)
torch_condition_mask = torch_A > 3
torch_conditional = torch.where(torch_condition_mask, torch_A, torch.tensor(0))
print("Conditional mask (A > 3):")
print(torch_conditional)
seq_length = 4
causal_mask = torch.tril(torch.ones(seq_length, seq_length))
print("Causal mask (lower-triangular matrix):")
print(causal_mask)

print("\n4. pandas masking:")
df_A = pd.DataFrame(A)
pd_mask = pd.DataFrame(mask)
pd_masked = df_A.mask(~pd_mask, 0)  # zero out entries where the mask is False
print("Apply mask (A.mask(~mask)):")
print(pd_masked)
pd_masked_specific = df_A.where(pd_mask, -1)
print("Apply mask (replace with a specific value):")
print(pd_masked_specific)
pd_condition_mask = df_A > 3
pd_conditional = df_A.where(pd_condition_mask, 0)
print("Conditional mask (A > 3):")
print(pd_conditional)

print("\n5. SciPy masking (sparse matrix):")
sparse_A = sp.csr_matrix(A)
sparse_mask = sp.csr_matrix(mask)
scipy_masked = sparse_A.multiply(sparse_mask)
print("Apply mask (sparse):")
print(scipy_masked.toarray())
```
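In attention layers, the mask is typically applied to the score matrix by writing a very large negative number into the masked positions before softmax, so that they receive (near-)zero weight. A minimal padding-mask sketch; the scores and pad positions are invented for illustration:

```python
import torch
import torch.nn.functional as F

scores = torch.randn(2, 4)                             # (batch, seq_len) attention scores, illustrative
pad_mask = torch.tensor([[True, True, True, False],    # False marks padded positions
                         [True, True, False, False]])

masked_scores = scores.masked_fill(~pad_mask, float('-inf'))
weights = F.softmax(masked_scores, dim=-1)
print(weights)  # padded positions get weight 0
```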
12. Slicing and Indexing
Concept: extract a subset of a matrix or specific elements from it.
Applications: feature selection; extracting items from a batch; splitting heads in multi-head attention; selectively accessing hidden states.
Example: A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]  →  submatrix A[1:3, 1:3] = [[6, 7], [10, 11]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

A = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print("Original matrix:")
print(A)

print("\n1. NumPy slicing and indexing:")
np_slice = A[1:3, 1:3]
print("Submatrix A[1:3, 1:3]:")
print(np_slice)
np_row = A[1, :]
np_col = A[:, 2]
print("Second row:")
print(np_row)
print("Third column:")
print(np_col)
np_specific = A[[0, 2], [1, 3]]  # fancy indexing: picks A[0,1] and A[2,3]
print("Specific elements (A[0,1] and A[2,3]):")
print(np_specific)
np_bool_idx = A[A > 5]
print("All elements greater than 5:")
print(np_bool_idx)

print("\n2. TensorFlow slicing and indexing:")
tf_A = tf.constant(A)
tf_slice = tf_A[1:3, 1:3]
print("Submatrix A[1:3, 1:3]:")
print(tf_slice.numpy())
tf_row = tf_A[1, :]
tf_col = tf_A[:, 2]
print("Second row:")
print(tf_row.numpy())
print("Third column:")
print(tf_col.numpy())
tf_specific = tf.gather_nd(tf_A, [[0, 1], [2, 3]])
print("Specific elements (A[0,1] and A[2,3]):")
print(tf_specific.numpy())
tf_bool_mask = tf_A > 5
tf_bool_idx = tf.boolean_mask(tf_A, tf_bool_mask)
print("All elements greater than 5:")
print(tf_bool_idx.numpy())

print("\n3. PyTorch slicing and indexing:")
torch_A = torch.tensor(A)
torch_slice = torch_A[1:3, 1:3]
print("Submatrix A[1:3, 1:3]:")
print(torch_slice)
torch_row = torch_A[1, :]
torch_col = torch_A[:, 2]
print("Second row:")
print(torch_row)
print("Third column:")
print(torch_col)
torch_specific = torch_A[[0, 2], [1, 3]]
print("Specific elements (A[0,1] and A[2,3]):")
print(torch_specific)
torch_bool_mask = torch_A > 5
torch_bool_idx = torch_A[torch_bool_mask]
print("All elements greater than 5:")
print(torch_bool_idx)

print("\n4. pandas slicing and indexing:")
df_A = pd.DataFrame(A)
pd_slice = df_A.iloc[1:3, 1:3]
print("Submatrix A[1:3, 1:3]:")
print(pd_slice)
pd_row = df_A.iloc[1, :]
pd_col = df_A.iloc[:, 2]
print("Second row:")
print(pd_row)
print("Third column:")
print(pd_col)
pd_bool_idx = df_A[df_A > 5]
print("All elements greater than 5 (others become NaN):")
print(pd_bool_idx)

print("\n5. SciPy slicing and indexing (sparse matrix):")
sparse_A = sp.csr_matrix(A)
scipy_slice = sparse_A[1:3, 1:3]
print("Submatrix A[1:3, 1:3]:")
print(scipy_slice.toarray())
scipy_row = sparse_A[1, :].toarray()
scipy_col = sparse_A[:, 2].toarray()
print("Second row:")
print(scipy_row)
print("Third column:")
print(scipy_col)
```
13. Dimension Squeezing and Expansion (Squeeze/Unsqueeze)
Concept: squeeze removes dimensions of size 1 from a tensor (all of them, or a specified one); unsqueeze inserts a dimension of size 1 at a given position.
Applications: adding a batch dimension for batched operations; adjusting channel dimensions in CNNs; matching input/output dimensions between network layers; preparing dimensions before broadcasting.
Example: shape [2, 1, 3]: [[[1, 2, 3]], [[4, 5, 6]]]  →  squeeze  →  shape [2, 3]: [[1, 2, 3], [4, 5, 6]]
Example: shape [2, 3]: [[1, 2, 3], [4, 5, 6]]  →  unsqueeze(1)  →  shape [2, 1, 3]: [[[1, 2, 3]], [[4, 5, 6]]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd

x = np.array([[[1, 2, 3]], [[4, 5, 6]]])  # shape (2, 1, 3)
print("Original array:")
print(x)
print("Shape:", x.shape)

print("\n1. NumPy squeeze/expand_dims:")
np_squeezed = np.squeeze(x, axis=1)
print("After squeeze:")
print(np_squeezed)
print("Shape:", np_squeezed.shape)
y = np.array([[1, 2, 3], [4, 5, 6]])
np_unsqueezed = np.expand_dims(y, axis=1)
print("\nAfter expand_dims:")
print(np_unsqueezed)
print("Shape:", np_unsqueezed.shape)

print("\n2. TensorFlow squeeze/expand_dims:")
tf_x = tf.constant(x)
tf_squeezed = tf.squeeze(tf_x, axis=1)
print("After squeeze:")
print(tf_squeezed.numpy())
print("Shape:", tf_squeezed.shape)
tf_y = tf.constant(y)
tf_unsqueezed = tf.expand_dims(tf_y, axis=1)
print("\nAfter expand_dims:")
print(tf_unsqueezed.numpy())
print("Shape:", tf_unsqueezed.shape)

print("\n3. PyTorch squeeze/unsqueeze:")
torch_x = torch.tensor(x)
torch_squeezed = torch_x.squeeze(1)
print("After squeeze:")
print(torch_squeezed)
print("Shape:", torch_squeezed.size())
torch_y = torch.tensor(y)
torch_unsqueezed = torch_y.unsqueeze(1)
print("\nAfter unsqueeze:")
print(torch_unsqueezed)
print("Shape:", torch_unsqueezed.size())

print("\n4. pandas has no direct squeeze/unsqueeze equivalent; convert to NumPy first")
print("\n5. SciPy sparse matrices are strictly 2-D; multi-dimensional squeeze/unsqueeze requires dense arrays")
```
14. Memory Contiguity (Contiguous)
Concept: ensure a tensor is stored contiguously in memory; some operations (e.g., transpose, slicing) produce tensors whose memory layout is no longer contiguous.
Applications: calling operations that require contiguous memory; improving the efficiency of computations that follow view operations; CUDA optimizations; interfacing with external libraries that expect contiguous buffers.
Simplified picture: a non-contiguous tensor (e.g., after a transpose) still stores [1, 2, 3, 4, 5, 6] in memory while being read logically in the order 1, 4, 2, 5, 3, 6; making it contiguous rewrites memory as [1, 4, 2, 5, 3, 6] so the physical and logical orders match.
```python
import numpy as np
import tensorflow as tf
import torch

matrix = np.array([[1, 2, 3], [4, 5, 6]])

print("1. PyTorch contiguous:")
torch_matrix = torch.tensor(matrix)
print("Original tensor contiguous:", torch_matrix.is_contiguous())
torch_transposed = torch_matrix.T
print("\nContiguous after transpose:", torch_transposed.is_contiguous())
torch_cont = torch_transposed.contiguous()  # copies data into a contiguous layout
print("Contiguous after .contiguous():", torch_cont.is_contiguous())
print("Same values:", torch.all(torch_transposed == torch_cont).item())

print("\n2. NumPy ascontiguousarray:")
np_transposed = matrix.T
print("C-contiguous after transpose:", np_transposed.flags.c_contiguous)
np_cont = np.ascontiguousarray(np_transposed)
print("C-contiguous after ascontiguousarray:", np_cont.flags.c_contiguous)

print("\n3. TensorFlow has no direct contiguous() equivalent, but tf.identity creates a copy")
tf_matrix = tf.constant(matrix)
tf_transposed = tf.transpose(tf_matrix)
tf_copy = tf.identity(tf_transposed)

print("\n4. pandas has no direct contiguous() equivalent")
print("\n5. SciPy sparse matrices use their own storage formats; converting between formats rearranges the data")
```
15. Getting Tensor Dimensions (Size)
Concept: obtain a tensor's dimension information (its shape).
Applications: debugging and verifying tensor dimensions; building network layers dynamically; dimension checks before data processing; adjusting batch sizes dynamically.
Example: a 3-D tensor of shape [2, 3, 4] contains 2 blocks, each a 3×4 matrix of values.
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

array = np.random.rand(2, 3, 4)

print("1. NumPy shape:")
print("shape attribute:", array.shape)
print("Number of dimensions:", array.ndim)
print("Total number of elements:", array.size)

print("\n2. TensorFlow shape:")
tf_tensor = tf.constant(array)
print("shape attribute:", tf_tensor.shape)
print("Number of dimensions:", tf.rank(tf_tensor).numpy())
print("Total number of elements:", tf.size(tf_tensor).numpy())

print("\n3. PyTorch size:")
torch_tensor = torch.tensor(array)
print("size():", torch_tensor.size())
print("shape attribute:", torch_tensor.shape)
print("Size of dimension 0, size(0):", torch_tensor.size(0))
print("Number of dimensions:", torch_tensor.dim())

print("\n4. pandas shape:")
df = pd.DataFrame(np.random.rand(3, 4))
print("shape attribute:", df.shape)
print("Number of rows:", df.shape[0])
print("Number of columns:", df.shape[1])
print("Total number of elements:", df.size)

print("\n5. SciPy shape:")
sparse_matrix = sp.csr_matrix(np.random.rand(3, 4))
print("shape attribute:", sparse_matrix.shape)
print("Number of dimensions:", sparse_matrix.ndim)
print("Number of non-zero elements:", sparse_matrix.nnz)
```
16. Tensor Repetition (Repeat)
Concept: repeat a tensor's elements along specified dimensions.
Applications: generating batched data; building attention masks; expanding feature vectors to match other tensors; upsampling in image processing.
Example: [[1, 2], [3, 4]]  →  repeat(1, 2) = [[1, 2, 1, 2], [3, 4, 3, 4]]
Example: [[1, 2], [3, 4]]  →  repeat(2, 2) = [[1, 2, 1, 2], [3, 4, 3, 4], [1, 2, 1, 2], [3, 4, 3, 4]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

matrix = np.array([[1, 2], [3, 4]])
print("Original matrix:")
print(matrix)
print("Shape:", matrix.shape)

print("\n1. NumPy tile/repeat:")
np_tiled = np.tile(matrix, (2, 2))
print("After np.tile(matrix, (2, 2)):")
print(np_tiled)
print("Shape:", np_tiled.shape)
np_repeat_0 = np.repeat(matrix, 2, axis=0)  # repeats each row element by element
print("\nAfter np.repeat(matrix, 2, axis=0):")
print(np_repeat_0)
print("Shape:", np_repeat_0.shape)
np_repeat_1 = np.repeat(matrix, 2, axis=1)
print("\nAfter np.repeat(matrix, 2, axis=1):")
print(np_repeat_1)
print("Shape:", np_repeat_1.shape)

print("\n2. TensorFlow tile/repeat:")
tf_matrix = tf.constant(matrix)
tf_tiled = tf.tile(tf_matrix, [2, 2])
print("After tf.tile([2, 2]):")
print(tf_tiled.numpy())
print("Shape:", tf_tiled.shape)
tf_repeat_0 = tf.repeat(tf_matrix, 2, axis=0)
print("\nAfter tf.repeat(2, axis=0):")
print(tf_repeat_0.numpy())
print("Shape:", tf_repeat_0.shape)

print("\n3. PyTorch repeat:")
torch_matrix = torch.tensor(matrix)
torch_repeat = torch_matrix.repeat(2, 2)  # tiles the whole tensor, like np.tile
print("After tensor.repeat(2, 2):")
print(torch_repeat)
print("Shape:", torch_repeat.size())
vector = torch.tensor([1, 2, 3])
vector_expanded = vector.unsqueeze(0)          # shape (1, 3)
vector_repeated = vector_expanded.repeat(3, 1)  # shape (3, 3)
print("\nVector after unsqueeze(0) and repeat(3, 1):")
print(vector_repeated)
print("Shape:", vector_repeated.size())

print("\n4. pandas repeat:")
df = pd.DataFrame({'A': [1, 3], 'B': [2, 4]})
df_repeat_index = pd.DataFrame(np.repeat(df.values, 2, axis=0), columns=df.columns)
print("Each row repeated twice:")
print(df_repeat_index)

print("\n5. SciPy sparse matrices have no repeat function; convert to a dense array first")
```
17. Upper Triangular Matrix (Triu)
Concept: extract or create an upper triangular matrix, keeping only the elements on and above the main diagonal.
Applications: generating attention masks (autoregressive models, Transformer decoders); matrix factorizations; solving linear systems; avoiding redundant computation (e.g., in distance matrices).
Example: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  →  triu = [[1, 2, 3], [0, 5, 6], [0, 0, 9]];  triu(k=1) = [[0, 2, 3], [0, 0, 6], [0, 0, 0]]
```python
import numpy as np
import tensorflow as tf
import torch
import pandas as pd
import scipy.sparse as sp

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("Original matrix:")
print(matrix)

print("\n1. NumPy triu:")
np_triu = np.triu(matrix)
print("np.triu(matrix):")
print(np_triu)
np_triu_k1 = np.triu(matrix, k=1)
print("\nnp.triu(matrix, k=1):")
print(np_triu_k1)

print("\n2. TensorFlow upper triangle:")
tf_matrix = tf.constant(matrix)
tf_triu = tf.linalg.band_part(tf_matrix, 0, -1)  # keep the diagonal and everything above it
print("TF upper triangular matrix:")
print(tf_triu.numpy())
# strictly above the diagonal: upper triangle minus the diagonal itself
tf_triu_k1 = tf.linalg.band_part(tf_matrix, 0, -1) - tf.linalg.band_part(tf_matrix, 0, 0)
print("\nTF strictly above the main diagonal:")
print(tf_triu_k1.numpy())

print("\n3. PyTorch triu:")
torch_matrix = torch.tensor(matrix)
torch_triu = torch.triu(torch_matrix)
print("torch.triu():")
print(torch_triu)
torch_triu_k1 = torch.triu(torch_matrix, diagonal=1)
print("\ntorch.triu(diagonal=1):")
print(torch_triu_k1)

print("\n4. pandas has no triu function; go through NumPy:")
df = pd.DataFrame(matrix)
df_triu = pd.DataFrame(np.triu(df.values))
print(df_triu)

print("\n5. SciPy triu:")
sparse_matrix = sp.csr_matrix(matrix)
scipy_triu = sp.triu(sparse_matrix)
print("sp.triu():")
print(scipy_triu.toarray())
scipy_triu_k1 = sp.triu(sparse_matrix, k=1)
print("\nsp.triu(k=1):")
print(scipy_triu_k1.toarray())
```
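Tying this back to the attention-mask use case: a causal (look-ahead) mask marks every position above the main diagonal so that token i cannot attend to tokens j > i. A minimal sketch using `triu` together with `masked_fill`; the score matrix is random and the sequence length arbitrary:

```python
import torch
import torch.nn.functional as F

seq_len = 4
scores = torch.randn(seq_len, seq_len)  # illustrative attention scores

# True above the main diagonal = positions that must not be attended to
causal_mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
masked_scores = scores.masked_fill(causal_mask, float('-inf'))
weights = F.softmax(masked_scores, dim=-1)

print(weights)  # row i has non-zero weights only for positions 0..i
```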