Reading PDF, XLS, XLSX, DOC, DOCX, CSV, and TXT File Content in Node.js

Published in Coding In Depth
3 min read · Dec 31, 2019


Reading file content is a common requirement, and it can be harder than it looks. There are two ways of reading files in Node.js: the blocking (synchronous) way and the non-blocking (asynchronous) way. A few days ago I ran into a scenario where we were uploading files and reading their content. Most of the files were either pdf or doc files. Here I am combining all the scenarios I could think of: reading doc, Docx, pdf, Xls, Xlsx, CSV, and plain text files. Reading CSV and text files is easy, because the built-in fs module returns their content directly. For pdf, Docx, and Xlsx files we have to install dependencies. Below are the dependencies that need to be installed for the different file types.
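To make the blocking and non-blocking options concrete, here is a minimal sketch that reads a small CSV file both ways with the built-in fs module. The sample file, the helper names (readTextSync, readTextAsync, parseSimpleCsv) and the naive comma split are assumptions made for illustration; real CSV data with quoted fields would need a proper parser.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical sample file, created here only so the sketch runs on its own.
const samplePath = path.join(os.tmpdir(), 'sample-upload.csv');
fs.writeFileSync(samplePath, 'name,age\nAsha,30\nRavi,25\n', 'utf8');

// Blocking (synchronous): the call returns the content directly,
// and nothing else runs on this thread until the read finishes.
function readTextSync(filePath) {
  return fs.readFileSync(filePath, 'utf8');
}

// Non-blocking (asynchronous): the call returns a Promise right away,
// and the event loop stays free while the read is in progress.
function readTextAsync(filePath) {
  return fs.promises.readFile(filePath, 'utf8');
}

// Naive CSV split: assumes no quoted fields, embedded commas, or embedded newlines.
function parseSimpleCsv(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== '')
    .map((line) => line.split(','));
}

// Synchronous usage: the rows are available on the next line.
const rowsSync = parseSimpleCsv(readTextSync(samplePath));
console.log(rowsSync);

// Asynchronous usage: the rows arrive later, inside the callback.
readTextAsync(samplePath)
  .then((text) => console.log(parseSimpleCsv(text)))
  .catch((err) => console.error('Read failed:', err.message));
```

In a server that handles uploads, the asynchronous form is usually the safer default, because a large file read synchronously would hold up every other request until it completes.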

  1. Install Dependencies:

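Install the dependencies needed to read the files. The first is file-system, an npm package for file access (Node.js also ships a built-in fs module that needs no installation). The second is pdfreader, for PDF content. Install xlsx for reading Xls and Xlsx workbooks. node-stream-zip is used for doc and Docx files: a Docx file is a ZIP archive, and node-stream-zip extracts its contents. Note that legacy binary doc files are not ZIP archives, so they may need a different tool.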

//To read the file
npm install file-system --save
//Read PDF content
npm install pdfreader
//Reading XLSX
npm install xlsx
//Extract binaries and used for doc, docx file type
npm install node-stream-zip
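As a rough sketch of how node-stream-zip can be put to work on a Docx file: the archive typically stores the main body in an entry named word/document.xml, which can be extracted and stripped of its XML tags. The helper names (documentXmlToText, readDocxText) and the regex-based tag stripping are assumptions for illustration, not the only way to do this. Tabs, line breaks inside a paragraph, tables, and headers are ignored here.

```javascript
// Pure helper: turn WordprocessingML into plain text.
// Assumption: paragraphs are closed with </w:p>; everything else is just a tag to drop.
function documentXmlToText(xml) {
  return xml
    .replace(/<\/w:p>/g, '\n') // each closing paragraph tag becomes a newline
    .replace(/<[^>]+>/g, '')   // drop all remaining tags
    .replace(/&lt;/g, '<')     // decode the basic XML entities
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&')    // decode &amp; last to avoid double decoding
    .trim();
}

// Reads the main document part from a Docx archive and returns its text.
function readDocxText(filePath) {
  // Required lazily so the helper above can be used without the package installed.
  const StreamZip = require('node-stream-zip');
  return new Promise((resolve, reject) => {
    const zip = new StreamZip({ file: filePath, storeEntries: true });
    zip.on('error', reject);
    zip.on('ready', () => {
      try {
        // entryDataSync returns a Buffer with the uncompressed entry content.
        const xml = zip.entryDataSync('word/document.xml').toString('utf8');
        resolve(documentXmlToText(xml));
      } catch (err) {
        reject(err);
      } finally {
        zip.close();
      }
    });
  });
}

// Hypothetical usage (the file name is a placeholder):
// readDocxText('uploaded-file.docx').then(console.log).catch(console.error);
```

Keeping the XML-to-text step separate from the ZIP handling makes it easy to check the text extraction on a small XML string before involving real files.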
