system-design-101/data/guides/5-functions-to-merge-data-with-pandas.md
Kamran Ahmed ee4b7305a2 Adds ByteByteGo guides and links (#106)
This PR adds all the guides from the [Visual Guides](https://bytebytego.com/guides/)
section of ByteByteGo to the repository with proper links.

- [x] Markdown files for guides and categories are placed inside
`data/guides` and `data/categories`
- [x] Guide links in the readme are auto-generated using
`scripts/readme.ts`. Every time you run `npm run update-readme`, the
script reads the categories and guides from the above-mentioned
folders, generates production links for them, and populates the table
of contents in the readme. This ensures that any future guides and
categories are automatically added to the readme.
- [x] Sorting inside the readme matches the actual category and guides
sorting on production
2025-03-31 22:16:44 -07:00


| title | description | image | createdAt | draft | categories | tags |
| --- | --- | --- | --- | --- | --- | --- |
| 5 Functions to Merge Data with Pandas | Explore 5 Pandas functions for efficient data merging and analysis. | https://assets.bytebytego.com/diagrams/0192-five-pandas.jpg | 2024-03-08 | false | ai-machine-learning | Pandas, Data Manipulation |

How do we quickly merge data without Microsoft Excel?

Here are 5 useful pandas functions for production data analysis.

  • Concat: this function supports the vertical and horizontal combination of two tables. Concat can quickly combine the data from different shards.
  • Append: this function adds rows to an existing table. Append can be used in web crawlers: newly crawled data is appended to the table as it arrives. Note that `DataFrame.append` was removed in pandas 2.0; `pd.concat` now covers this use.
  • Merge: this function supports horizontal combination on keys. It works similarly to database joins. Merge can be used to combine data from different domains with the same keys.
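  • Join: this function combines tables on their index and works similarly to database joins. It defaults to a left join; passing `how="outer"` gives an outer join.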
  • Combine: this function applies a calculation while combining two tables. The example below chooses the smaller value for each cell. Combine is useful in data cleansing.
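
The five operations above can be sketched roughly as follows. This is a minimal illustration, assuming pandas 2.x; every table name and value is hypothetical and chosen only to keep the example small. Because `DataFrame.append` no longer exists in pandas 2.x, the append step is shown with `pd.concat`.

```python
import pandas as pd

# Hypothetical sample tables standing in for two data shards.
shard_a = pd.DataFrame({"id": [1, 2], "score": [90, 85]})
shard_b = pd.DataFrame({"id": [3, 4], "score": [70, 95]})

# Concat (vertical): stack the rows of both shards into one table.
stacked = pd.concat([shard_a, shard_b], ignore_index=True)

# Concat (horizontal): place the tables side by side, aligned on the index.
side_by_side = pd.concat([shard_a, shard_b.add_suffix("_b")], axis=1)

# Append: DataFrame.append was removed in pandas 2.0, so a newly crawled
# row is added with pd.concat instead.
new_row = pd.DataFrame({"id": [5], "score": [60]})
appended = pd.concat([stacked, new_row], ignore_index=True)

# Merge: horizontal combination on a shared key column, like a SQL join.
users = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
merged = pd.merge(users, stacked, on="id", how="inner")

# Join: combines on the index; how="outer" keeps keys from both sides
# (the default is how="left").
left = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])
right = pd.DataFrame({"b": [3, 4]}, index=["y", "z"])
joined = left.join(right, how="outer")

# Combine: apply a function column by column; here, keep the smaller
# value for each cell.
df1 = pd.DataFrame({"v": [1, 8], "w": [5, 2]})
df2 = pd.DataFrame({"v": [4, 3], "w": [0, 9]})
smaller = df1.combine(df2, lambda s1, s2: s1.where(s1 < s2, s2))
```

Rows that exist on only one side of the outer join come back with `NaN` in the other side's columns, which is often the first thing to check when a join result looks larger than expected.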