Sourabh Jain
Aug 29, 2017

I get an error when I call read_pdf('http://www.wrldc.in/9_reportNew/dailydata_01082017.pdf').

---------------------------------------------------------------------------
ParserError Traceback (most recent call last)
<ipython-input-10-cc2636d32540> in <module>()
----> 1 data = read_pdf('http://www.wrldc.in/9_reportNew/dailydata_01082017.pdf')

/Users/sourabhjain/anaconda3/lib/python3.6/site-packages/tabula/wrapper.py in read_pdf(input_path, output_format, encoding, java_options, pandas_options, multiple_tables, **kwargs)
95 pandas_options['encoding'] = pandas_options.get('encoding', encoding)
96
---> 97 return pd.read_csv(io.BytesIO(output), **pandas_options)
98
99

/Users/sourabhjain/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
653 skip_blank_lines=skip_blank_lines)
654
--> 655 return _read(filepath_or_buffer, kwds)
656
657 parser_f.__name__ = name

/Users/sourabhjain/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
409
410 try:
--> 411 data = parser.read(nrows)
412 finally:
413 parser.close()

/Users/sourabhjain/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py in read(self, nrows)
980 raise ValueError('skipfooter not supported for iteration')
981
--> 982 ret = self._engine.read(nrows)
983
984 if self.options.get('as_recarray'):

/Users/sourabhjain/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py in read(self, nrows)
1717 def read(self, nrows=None):
1718 try:
-> 1719 data = self._reader.read(nrows)
1720 except StopIteration:
1721 if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read (pandas/_libs/parsers.c:10862)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory (pandas/_libs/parsers.c:11138)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows (pandas/_libs/parsers.c:11884)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows (pandas/_libs/parsers.c:11755)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error (pandas/_libs/parsers.c:28765)()

ParserError: Error tokenizing data. C error: Expected 10 fields in line 18, saw 11
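
To illustrate what that last message means: pandas found a row with more comma-separated fields than it expected (11 instead of 10, at line 18 of the CSV that tabula handed to pd.read_csv). Below is a minimal sketch of how such rows could be located, using only the standard library. The helper name find_ragged_rows and the sample text are hypothetical, made up for illustration; they are not the actual output for this PDF, and the check only approximates how pandas decides the expected width.

```python
import csv
import io


def find_ragged_rows(csv_text):
    """Return (line_number, field_count) for rows whose width differs
    from the first non-empty row. Hypothetical helper for illustration."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    # Use the first non-empty row as the expected width.
    expected = next((len(row) for row in rows if row), None)
    if expected is None:
        return []
    # Blank rows are skipped; line numbers are 1-based.
    return [
        (line_no, len(row))
        for line_no, row in enumerate(rows, start=1)
        if row and len(row) != expected
    ]


# Made-up sample: three fields per row, except line 3, which has four.
sample = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"
print(find_ragged_rows(sample))  # [(3, 4)]
```

A row like line 3 in the sample is the kind of input that makes the tokenizer stop with "Expected N fields in line M, saw K".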

Kindly help. I am using macOS Sierra 10.12.4 with Jupyter Notebook, and I have Java 8 Update 111.
