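---
aliases:
atlas: "[[Atlas/Card|Card]]"
created: 2024-03-15 21:36:51
modified: 2024-03-15 21:55:16
tags:
title: Data Processing Procedure
---

# Data Cleaning

A Python script merges all the business detail sheets provided by the contact person into a single file, so that the later analysis can run against one dataset.

```python
import os

import pandas as pd

# Directory containing the Excel files to merge
directory = 'E:/Projects/analyse'
output_name = 'merged_data.xlsx'

# DataFrames read from each file
all_data = []

# Read every Excel file in the directory (sorted for a stable row order)
for filename in sorted(os.listdir(directory)):
    # Skip a previous merge result so it is not merged into itself
    if filename == output_name:
        continue
    if filename.endswith(('.xlsx', '.xls')):
        file_path = os.path.join(directory, filename)
        df = pd.read_excel(file_path)
        all_data.append(df)

# pd.concat raises ValueError on an empty list, so fail with a clearer message
if not all_data:
    raise SystemExit(f"No Excel files found in {directory}")

# Concatenate all data into a single DataFrame
merged_data = pd.concat(all_data, ignore_index=True)

# Save the merged DataFrame to a new Excel file in the current working directory
merged_data.to_excel(output_name, index=False)

print(f"Files have been merged and saved as '{output_name}'")
```

The following columns were removed: serial number (序号), service order number (服务单号), dispatch order number (调度单号), contact person (联系人), contact phone (联系电话), patient information (患者信息), sales (销售), referrer (介绍人), customer service (客服), dispatcher (调度), contract group (承包组), license plate (车牌), crew members (出车成员), medical staff on the trip (医护出车), and task notes (任务备注).

It was confirmed that no order whose dispatch order status (调度单状态) is anything other than "returned" (已返回) generated revenue. Those orders were therefore filtered out and their total transaction price (总成交价) was set to 0 so that they do not distort the calculation. Monthly revenue is as follows:

| Month | 2022-04 | 2022-05 | 2022-06 | 2022-07 | 2022-08 | 2022-09 | 2022-10 | 2022-11 | 2022-12 | 2023-01 | 2023-02 | 2023-03 | 2023-04 | 2023-05 | 2023-06 | 2023-07 | 2023-08 | 2023-09 | 2023-10 | 2023-11 | 2023-12 | 2024-01 | 2024-02 |
| --- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
| Revenue | 3328917.00 | 3362286.00 | 3973152.00 | 3462363.00 | 4250864.00 | 4144810.76 | 4360712.00 | 4587020.00 | 4880988.50 | 4197830.00 | 3309294.00 | 3338335.00 | 4069565.00 | 4292058.60 | 3101339.20 | 3834394.40 | 3114722.80 | 2750602.00 | 4161377.40 | 3465051.00 | 2898861.00 | 3426260.50 | 3559553.15 |

![image.png|600](https://image.kfdr.top/i/2024/03/15/65f453278157a.png)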
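
The note describes the column removal, the zeroing of non-returned orders, and the monthly totals only in prose, so here is a minimal sketch of how those steps might look in pandas. It is not the script that was actually used. The column names `调度单状态` and `总成交价` and the status value `已返回` come from the text; the date column name `日期`, the helper `monthly_revenue`, and the sample rows are assumptions made for illustration.

```python
import pandas as pd

# Columns the note says were removed (names taken from the text).
DROPPED_COLUMNS = [
    "序号", "服务单号", "调度单号", "联系人", "联系电话", "患者信息", "销售",
    "介绍人", "客服", "调度", "承包组", "车牌", "出车成员", "医护出车", "任务备注",
]

STATUS_COL = "调度单状态"  # dispatch order status (named in the text)
PRICE_COL = "总成交价"     # total transaction price (named in the text)
DATE_COL = "日期"          # ASSUMPTION: the real date column name is not given
RETURNED = "已返回"        # status value meaning "returned"


def monthly_revenue(df: pd.DataFrame) -> pd.Series:
    """Drop unused columns, zero non-returned orders, sum revenue by month."""
    # errors="ignore" skips listed columns that are absent from the frame.
    cleaned = df.drop(columns=DROPPED_COLUMNS, errors="ignore").copy()
    # Orders that were not returned produced no revenue.
    cleaned.loc[cleaned[STATUS_COL] != RETURNED, PRICE_COL] = 0
    # Group by calendar month, formatted as YYYY-MM.
    month = pd.to_datetime(cleaned[DATE_COL]).dt.strftime("%Y-%m")
    return cleaned.groupby(month)[PRICE_COL].sum()


# Made-up sample rows, only to show the shape of the result.
sample = pd.DataFrame({
    "序号": [1, 2, 3],
    DATE_COL: ["2022-04-03", "2022-04-20", "2022-05-01"],
    STATUS_COL: ["已返回", "已取消", "已返回"],
    PRICE_COL: [1000.0, 500.0, 800.0],
})
result = monthly_revenue(sample)
print(result)
```

Setting the price to 0 rather than dropping the rows keeps the non-returned orders available for other counts (for example, cancellation rates) while leaving the revenue sums unaffected.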