This column is updated on an ongoing basis with assorted CTF challenges from 2025-2026, kept here as a record.
CISCN@2025 The Silent Heist - AI
Use a GaussianCopula to fit the high-dimensional dependence structure of the data, then generate 100000 records. The observed error rate was about 50/6000.
```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from copulae import GaussianCopula
from scipy.stats import norm
from sklearn.preprocessing import StandardScaler

# Load the public ledger and fill missing values with the column means.
df = pd.read_csv('public_ledger.csv')
df = df.fillna(df.mean())

# Standardize each column before fitting the copula.
scaler = StandardScaler()
scaled_data = scaler.fit_transform(df)
print(scaled_data.shape)

# Fit a Gaussian copula to capture the dependence between columns.
_, ndim = scaled_data.shape
copula = GaussianCopula(dim=ndim)
copula.fit(scaled_data)

# Draw 100000 samples from the copula; the marginals are uniform on [0, 1].
new_data = copula.random(100000)

# Map the uniform samples back to the original scale,
# treating each column as normally distributed.
original_distribution = norm(loc=df.mean(), scale=df.std())
recovered_data = original_distribution.ppf(new_data)

generated_df = pd.DataFrame(recovered_data, columns=df.columns)
generated_df.to_csv('generated_ledger.csv', index=False, encoding='utf-8')

# Correlation heatmap of the generated data.
sns.heatmap(pd.DataFrame(recovered_data).corr(), annot=True)
plt.title("Generated Data Correlation")
plt.show()

# Per-feature histograms of the generated data.
for i in range(recovered_data.shape[1]):
    plt.hist(recovered_data[:, i], bins=30, alpha=0.5, label=f"Feature {i}")
plt.title("Generated Data Distribution")
plt.legend()
plt.show()
```

Next, clean the data by removing the most extreme outliers using the IQR method.
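As a rough illustration of the IQR cleaning step, here is a minimal sketch and not necessarily the exact filter used for this challenge. The helper name `drop_iqr_outliers` and the multiplier `k` are assumptions made for this example. `k = 1.5` is the conventional Tukey fence, while a larger value such as `3.0` removes only the most extreme rows.

```python
import pandas as pd


def drop_iqr_outliers(df: pd.DataFrame, k: float = 1.5) -> pd.DataFrame:
    """Keep only rows where every column lies within [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; `k` is an assumed parameter.
    """
    q1 = df.quantile(0.25)
    q3 = df.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Compare each column against its own bounds, then require all columns to pass.
    in_range = df.ge(lower, axis="columns") & df.le(upper, axis="columns")
    return df[in_range.all(axis=1)]


# Small in-memory demo: the row with a = 1000.0 falls outside the fence.
demo = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 1000.0],
    "b": [10.0, 11.0, 12.0, 13.0, 14.0],
})
cleaned = drop_iqr_outliers(demo)
print(cleaned)
```

Applied to the generated ledger, this would look something like `drop_iqr_outliers(generated_df, k=3.0)`, with the cleaned frame written back to CSV afterwards.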
...