aliases:
atlas: Atlas/Card
created: 2024-03-15 21:36:51
modified: 2024-03-15 21:55:16
tags:
title: Data Processing Workflow (数据处理过程)

Data Cleaning

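Python code was written to merge all of the itemized business records supplied by the point of contact into a single file, so that the subsequent data analysis can work from one dataset.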

import os
import pandas as pd

# The directory containing the Excel files
directory = 'E:/Projects/analyse'
output_file = 'merged_data.xlsx'

# List to hold the data from each file
all_data = []

# Loop through each file in the directory (sorted for a reproducible row order)
for filename in sorted(os.listdir(directory)):
    # Skip Excel lock files (~$...) and a previously saved merge result
    if filename.startswith('~$') or filename == output_file:
        continue
    if filename.endswith(('.xlsx', '.xls')):
        file_path = os.path.join(directory, filename)
        df = pd.read_excel(file_path)
        all_data.append(df)

# Concatenate all data into a single DataFrame
merged_data = pd.concat(all_data, ignore_index=True)

# Save the merged DataFrame to a new Excel file
merged_data.to_excel(output_file, index=False)

print(f"Files have been merged and saved as '{output_file}'")

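The following columns were removed: serial number (序号), service order number (服务单号), dispatch order number (调度单号), contact person (联系人), contact phone number (联系电话), patient information (患者信息), sales representative (销售), referrer (介绍人), customer service (客服), dispatcher (调度), contracting team (承包组), license plate (车牌), crew members (出车成员), medical staff on the trip (医护出车), and task notes (任务备注).

It was confirmed that no order whose dispatch order status (调度单状态) is anything other than "returned" (已返回) generated revenue. These orders were therefore filtered out and their total transaction price (总成交价) was set to 0 so that they do not distort the calculation. Monthly revenue is as follows: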

| Month | Revenue |
| --- | --- |
| 2022-04 | 3328917.00 |
| 2022-05 | 3362286.00 |
| 2022-06 | 3973152.00 |
| 2022-07 | 3462363.00 |
| 2022-08 | 4250864.00 |
| 2022-09 | 4144810.76 |
| 2022-10 | 4360712.00 |
| 2022-11 | 4587020.00 |
| 2022-12 | 4880988.50 |
| 2023-01 | 4197830.00 |
| 2023-02 | 3309294.00 |
| 2023-03 | 3338335.00 |
| 2023-04 | 4069565.00 |
| 2023-05 | 4292058.60 |
| 2023-06 | 3101339.20 |
| 2023-07 | 3834394.40 |
| 2023-08 | 3114722.80 |
| 2023-09 | 2750602.00 |
| 2023-10 | 4161377.40 |
| 2023-11 | 3465051.00 |
| 2023-12 | 2898861.00 |
| 2024-01 | 3426260.50 |
| 2024-02 | 3559553.15 |

![[image.png|600]]
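
The note does not include the code for the column removal, the zeroing of non-returned orders, or the monthly aggregation, so the following is only a minimal sketch of how those steps might look in pandas. The column names 调度单状态 and 总成交价 and the status value 已返回 come from the text above. The date column name `日期`, the function name `monthly_revenue`, and the sample rows (including the status 已取消) are assumptions made purely for illustration.

```python
import pandas as pd

# Columns the note lists as removed (names as they appear in the spreadsheets).
DROP_COLS = ['序号', '服务单号', '调度单号', '联系人', '联系电话', '患者信息', '销售',
             '介绍人', '客服', '调度', '承包组', '车牌', '出车成员', '医护出车', '任务备注']
STATUS_COL = '调度单状态'   # dispatch order status
PRICE_COL = '总成交价'      # total transaction price
RETURNED = '已返回'         # status value meaning "returned"
DATE_COL = '日期'           # ASSUMPTION: the real date column name is not given


def monthly_revenue(df: pd.DataFrame) -> pd.Series:
    """Drop unused columns, zero out non-returned orders, sum revenue per month."""
    # errors='ignore' tolerates inputs where some listed columns are absent
    cleaned = df.drop(columns=DROP_COLS, errors='ignore').copy()
    # Orders whose status is not "returned" produced no revenue
    cleaned.loc[cleaned[STATUS_COL] != RETURNED, PRICE_COL] = 0
    # Group by calendar month, labelled as YYYY-MM
    month = pd.to_datetime(cleaned[DATE_COL]).dt.strftime('%Y-%m')
    return cleaned.groupby(month)[PRICE_COL].sum()


# Tiny made-up sample, only to show the shape of the result
sample = pd.DataFrame({
    '序号': [1, 2, 3, 4],
    DATE_COL: ['2022-04-03', '2022-04-20', '2022-05-01', '2022-05-09'],
    STATUS_COL: ['已返回', '已取消', '已返回', '已返回'],
    PRICE_COL: [1000.0, 500.0, 800.0, 200.0],
})
result = monthly_revenue(sample)
print(result)
```

In practice the function would presumably be applied to the merged DataFrame (`merged_data` in the script above) rather than to a hand-built sample. Setting the price to 0 instead of dropping the rows keeps the order count intact for any later per-order analysis.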