j-bennet codes
Published in

j-bennet codes

How to use Pandas to access databases

and is that the right thing to do

https://gist.github.com/j-bennet/ce2f232ef8eda4cff95280d2ab7be19b
https://gist.github.com/j-bennet/ea28d615a831dc568a806d8b1b1dcbc6

But should you?

Explore the database using a CLI

https://gist.github.com/j-bennet/48c162a8f2d20f7718acb7bd5d1180af
https://gist.github.com/j-bennet/f067ef367d846e7dd323fcfe3e6fe339

Limit the fields to retrieve

https://gist.github.com/j-bennet/e85f7694cb4ec621f06c59a84c9b81da

Limit the records to retrieve

https://gist.github.com/j-bennet/2aefe79c0af40f870d6bb2703f94abca

Let database server handle joins

https://gist.github.com/j-bennet/ca2a85974c898707c02e3155308e0348

Estimate memory usage

https://gist.github.com/j-bennet/546a4c20fc9b178155acdc4aa6a158ab

Reduce memory usage with data types

https://gist.github.com/j-bennet/f3a9311aaceba79616bd0753451d7b1e
https://gist.github.com/j-bennet/ebdbcaf1fcca2f1957780ac5321ef247
  • Read the table (or query) in chunks, providing the chunksize parameter.
https://gist.github.com/j-bennet/ce556e9faed37c48336b34d442297b63
  • Convert datatypes of each chunk to smaller datatypes.
https://gist.github.com/j-bennet/a9a4971c7d48ee81341be4981e826504
  • Concatenate updated chunks into a new dataframe.
https://gist.github.com/j-bennet/3225de6081e210f56dd5bde21f07507e

Summary

  • Dataframes in Pandas are not lazy, they are loaded into memory, be aware of the memory usage.
  • Start exploring with a SQL client to determine the size and shape of data.
  • Proceed based on the size of data, to either load whole tables into Pandas, or query for only selected fields and possibly limit to a sample of records.
  • Let database do the joins, it’s good at it.
  • Estimate the size of data. If necessary, use memory-efficient data types.

--

--

On coding and data analysis, by Irina Truong (j-bennet@github)

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store