400,000 GitHub repositories, 1 billion files, 14 terabytes of code: Spaces or Tabs?
Felipe Hoffa

A flawed methodology undermines all the statistics you have provided.

First of all, the language selection is very strange. For historical reasons, most Python, Ruby, and CoffeeScript coders use spaces. Python and CoffeeScript even use indentation as a scoping construct. Take a look: YAML configuration acts as a natural extension of those languages and forbids tabs for indentation. Because of this, it is easier to use spaces for the whole project than for just some of the files.

The next problem is the analysis itself. It rests on two wrong assumptions: one about project scope and one about special formats. Of course, if a project was started by a Python developer who uses spaces, in most cases the project will contain only space indentation. And if you use an IDE that indents with spaces by default and you don’t care, your project will use that default indentation.

Special formats are a vaguer story, but such formats do exist. For example, JSDoc comment blocks are indented with spaces regardless of the file’s indentation style. And you are definitely not checking for the contents of comments, multiline strings, heredocs, and other such things.
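To illustrate the JSDoc point, here is a minimal Python sketch. The helper `leading_counts`, its `skip_comment_continuations` flag, and the sample file are all made up for this example, and the "line starts with `*` after whitespace" test is only a rough heuristic, not a real comment parser.

```python
from collections import Counter


def leading_counts(text, skip_comment_continuations=False):
    """Count lines that start with a tab vs. a space.

    With skip_comment_continuations=True, lines whose first non-blank
    character is '*' (typical JSDoc continuation lines) are ignored.
    This is a crude heuristic chosen for the illustration.
    """
    counts = Counter()
    for line in text.splitlines():
        if skip_comment_continuations and line.lstrip().startswith("*"):
            continue
        if line.startswith("\t"):
            counts["tab"] += 1
        elif line.startswith(" "):
            counts["space"] += 1
    return counts


# A tab-indented JavaScript file with a top-level JSDoc block.
JSDOC_SAMPLE = (
    "/**\n"
    " * Adds two numbers.\n"
    " * @param {number} a\n"
    " * @param {number} b\n"
    " */\n"
    "function add(a, b) {\n"
    "\treturn a + b;\n"
    "}\n"
)

naive = leading_counts(JSDOC_SAMPLE)
filtered = leading_counts(JSDOC_SAMPLE, skip_comment_continuations=True)
print(dict(naive))     # space lines outnumber tab lines
print(dict(filtered))  # only the real tab indent is left
```

In this sample the only real indentation is a tab, yet the naive count sees four space-led lines, all of them from the comment block.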

And, after all, the check itself is completely wrong. You just count lines with tabs and lines with spaces: you take the first space or tab match on each line (which can be in the middle of the line) and then sum up the values for the whole file. The result may turn out the same, but the flawed methodology invalidates it.
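Here is a small Python sketch of why the position of the match matters. It is my own reconstruction of the idea, not the query from the article; the patterns, the `classify` helper, and the sample are assumptions made for the illustration.

```python
import re
from collections import Counter

# First space or tab anywhere in the line (the check being criticized).
FIRST_ANYWHERE = re.compile(r"[ \t]")
# Space or tab only at the very start of the line (actual indentation).
LEADING_ONLY = re.compile(r"^[ \t]")


def count_lines(text, pattern):
    """Count lines whose first match of `pattern` is a tab vs. a space."""
    counts = Counter()
    for line in text.splitlines():
        match = pattern.search(line)
        if match:
            counts["tab" if match.group() == "\t" else "space"] += 1
    return counts


def classify(text, pattern):
    """Label a file by whichever kind of line is more frequent."""
    counts = count_lines(text, pattern)
    if counts["tab"] > counts["space"]:
        return "tabs"
    if counts["space"] > counts["tab"]:
        return "spaces"
    return "undecided"


# One tab-indented line; every other space sits inside a statement.
SAMPLE = "def f(x):\n\treturn x + 1\n\nprint(f(2))  # call it\nvalue = f(3)\n"

print(classify(SAMPLE, FIRST_ANYWHERE))
print(classify(SAMPLE, LEADING_ONLY))
```

The sample file is indented with a tab, but the unanchored pattern also counts the spaces in `def f(x):` and `value = f(3)`, so the per-file sum flips to spaces.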
