<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Maanav Shah on Medium]]></title>
        <description><![CDATA[Stories by Maanav Shah on Medium]]></description>
        <link>https://medium.com/@maanavshah?source=rss-4ae26f7ee9c------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*IvWFlkWKvDdk6PrracHgJw.jpeg</url>
            <title>Stories by Maanav Shah on Medium</title>
            <link>https://medium.com/@maanavshah?source=rss-4ae26f7ee9c------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sat, 16 May 2026 13:15:55 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@maanavshah/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[What is hot cache vs cold cache?]]></title>
            <link>https://medium.com/@maanavshah/what-is-hot-cache-vs-cold-cache-7b00b4329893?source=rss-4ae26f7ee9c------2</link>
            <guid isPermaLink="false">https://medium.com/p/7b00b4329893</guid>
            <category><![CDATA[cache]]></category>
            <category><![CDATA[memory-improvement]]></category>
            <category><![CDATA[inode]]></category>
            <dc:creator><![CDATA[Maanav Shah]]></dc:creator>
            <pubDate>Sun, 26 Mar 2023 11:55:28 GMT</pubDate>
            <atom:updated>2023-03-26T11:55:28.979Z</atom:updated>
            <content:encoded><![CDATA[<p>A cache is a structure that holds some values (inodes, memory pages, disk blocks, etc.) for faster lookup.</p><p>A cache works by storing short references in a fast search data structure (hash table, B+ tree) or on faster access media (RAM vs. HDD, SSD vs. HDD).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/678/0*l4fM78fvWwzwk2mb.jpg" /><figcaption>Cache vs Memory</figcaption></figure><p>To be able to do this fast search, your cache needs to hold values. Let’s look at an example.</p><p>Say you have a Linux system with some filesystem. To access a file in the filesystem you need to know where your file starts on the disk. This information is stored in the <strong>inode</strong>. For simplicity, we say that the inode table is stored somewhere on disk.</p><p>Now imagine that you need to read the file <em>/etc/fstab</em>. To do this you need to read the inode table from disk (10 ms), parse it to get the starting block of the file, and then read the file itself (10 ms). Total: ~20 ms.</p><p>This is a lot of work for a single read. So you add a cache in the form of a hash table in RAM. RAM access takes about 10 ns — roughly a million times faster than a 10 ms disk read. Each row in that hash table holds two values.</p><pre>(inode number or filename) : (starting disk block)</pre><p>But the problem is that at the start your cache is empty — such a cache is called a <strong>cold cache</strong>. To exploit the benefits of your cache you need to fill it with some values. How does that happen? When you look for some file, you first look in your inode cache. If you don’t find the inode in the cache (a <em>cache miss</em>), you shrug and do the full read cycle: read the inode table, parse it, and read the file itself. But after the parsing step, you save the inode number and the parsed starting disk block in your cache. And so it goes on and on — you try to read another file, you look in the cache, you get a cache miss (your cache is cold), you read from disk, you add a row to the cache.</p><p>So a cold cache doesn’t give you any speedup, because you are still reading from disk. In some cases a cold cache even makes your system slower, because you’re doing extra work (the extra lookup in the table) to warm up your cache.</p><p>After some time you’ll have some values in your cache, and at some point you try to read a file, you look in the cache and BAM! you find the inode (a <em>cache hit</em>)! Now you have the starting disk block, so you skip reading the inode table and start reading the file itself! You have just saved 10 ms!</p><p>Such a cache is called a <strong>warm cache</strong> (or <strong>hot cache</strong>) — a cache holding values that give you cache hits.</p>
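<p>A minimal Python sketch of this warm-up behavior (read_inode_table is a hypothetical stand-in for the slow ~10 ms disk path):</p><pre>inode_cache = {}  # filename : starting disk block; starts empty (cold)<br><br>def lookup_block(filename):<br>    if filename in inode_cache:            # cache hit: ~10 ns<br>        return inode_cache[filename]<br>    block = read_inode_table(filename)     # cache miss: ~10 ms disk read<br>    inode_cache[filename] = block          # warm the cache for next time<br>    return block</pre><p><strong>TL;DR</strong> Think of the cold and warm engine of a car. A cold cache doesn’t hold any values and can’t give you any speedup because, well, it’s empty. A warm cache holds some values and can give you that speedup.</p><p>That’s it. Hope this helps.</p><p><em>If you enjoyed this post, I’d be very grateful if you’d help it spread by emailing it to a friend or sharing it on LinkedIn or Twitter. Thank you!</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=7b00b4329893" width="1" height="1" alt="">]]></content:encoded>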
        </item>
        <item>
            <title><![CDATA[Upgrade Boto for Python 3 and Django 2.2]]></title>
            <link>https://awstip.com/upgrade-boto-for-python-3-and-django-2-2-ae89b35d801e?source=rss-4ae26f7ee9c------2</link>
            <guid isPermaLink="false">https://medium.com/p/ae89b35d801e</guid>
            <category><![CDATA[aws]]></category>
            <category><![CDATA[django]]></category>
            <category><![CDATA[boto3]]></category>
            <category><![CDATA[python]]></category>
            <dc:creator><![CDATA[Maanav Shah]]></dc:creator>
            <pubDate>Fri, 24 Feb 2023 16:35:37 GMT</pubDate>
            <atom:updated>2023-03-13T18:15:11.813Z</atom:updated>
            <content:encoded><![CDATA[<p>I recently wrote a blog post on upgrading your Python and Django services. This post is an extension of that one, for when you also need to upgrade the Boto package. <br><a href="https://medium.com/@maanavshah/upgrading-your-django-and-python-microservice-9cb480541907">https://medium.com/@maanavshah/upgrading-your-django-and-python-microservice-9cb480541907</a></p><h3>Boto</h3><p>Boto is the AWS package our app had been using while running on Python 2.7.</p><p>It is used for all our Amazon S3-related operations, like:</p><ul><li>Upload file</li><li>Upload data (JSON/string)</li><li>Download file</li><li>Generate a signed URL of an S3 file</li></ul><p>However, the boto package is not fully supported on Python 3, and AWS now promotes its successor, boto3, which has extensive support for Python 3. Hence, the shift to boto3 was inevitable.</p><h4>What major changes were made code-wise?</h4><p>The change to boto3 is quite straightforward, and one can refer to the documentation <a href="https://boto3.amazonaws.com/v1/documentation/api/latest/index.html">here</a>.</p><p>However, below are the major changes observed when migrating the above-mentioned S3 operations:</p><ul><li>The boto3 <strong>client/resource</strong> object now also expects the AWS region (the region_name param) along with the aws_access_key_id and aws_secret_access_key of the bucket to perform any operation. <br>This isn’t a breaking change per se, but if region_name isn’t provided, boto3 tries to perform the operation against the default AWS region. For example,</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ceCHGQBUa_7gsMW8kmd6TQ.png" /></figure><ul><li>Earlier, with boto, you would need to create a new S3Connection object, get a bucket object from it, generate the remote key, and then perform the actual operation. For example,</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1pneZKu6Fd2lPH-CiWi5hw.png" /></figure><p>Now, with boto3, all one needs is an instance of the <strong>client/resource</strong>, after which one can begin performing the operation. For example,</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/802/1*nvsNdR3GN-j2YSYQdUDMTQ.png" /></figure>
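<p>Since the examples above are screenshots, here is a rough text equivalent of both flows (bucket, key, and region names are placeholders, and ACCESS_KEY/SECRET_KEY are assumed to be defined elsewhere):</p><pre># boto (old)<br>from boto.s3.connection import S3Connection<br><br>conn = S3Connection(ACCESS_KEY, SECRET_KEY)<br>bucket = conn.get_bucket(&#39;my-bucket&#39;)<br>key = bucket.new_key(&#39;reports/report.csv&#39;)<br>key.set_contents_from_filename(&#39;report.csv&#39;)<br><br># boto3 (new)<br>import boto3<br><br>s3 = boto3.client(<br>    &#39;s3&#39;,<br>    region_name=&#39;ap-south-1&#39;,<br>    aws_access_key_id=ACCESS_KEY,<br>    aws_secret_access_key=SECRET_KEY,<br>)<br>s3.upload_file(&#39;report.csv&#39;, &#39;my-bucket&#39;, &#39;reports/report.csv&#39;)<br><br># generate a signed URL for the uploaded object<br>url = s3.generate_presigned_url(<br>    &#39;get_object&#39;,<br>    Params={&#39;Bucket&#39;: &#39;my-bucket&#39;, &#39;Key&#39;: &#39;reports/report.csv&#39;},<br>    ExpiresIn=3600,<br>)</pre><p><a href="https://boto3.amazonaws.com/v1/documentation/api/latest/index.html">Boto3 1.42.9 documentation</a></p><h3>That’s it. Hope this helps.</h3><p><em>If you enjoyed this post, I’d be very grateful if you’d help it spread by emailing it to a friend or sharing it on LinkedIn or Twitter. Thank you!</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=ae89b35d801e" width="1" height="1" alt=""><hr><p><a href="https://awstip.com/upgrade-boto-for-python-3-and-django-2-2-ae89b35d801e">Upgrade Boto for Python 3 and Django 2.2</a> was originally published in <a href="https://awstip.com">AWS Tip</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>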
        </item>
        <item>
            <title><![CDATA[Upgrading your Django and Python microservice]]></title>
            <link>https://medium.com/@maanavshah/upgrading-your-django-and-python-microservice-9cb480541907?source=rss-4ae26f7ee9c------2</link>
            <guid isPermaLink="false">https://medium.com/p/9cb480541907</guid>
            <category><![CDATA[django]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[upgrade]]></category>
            <dc:creator><![CDATA[Maanav Shah]]></dc:creator>
            <pubDate>Mon, 20 Feb 2023 12:17:03 GMT</pubDate>
            <atom:updated>2023-02-20T15:30:03.765Z</atom:updated>
            <content:encoded><![CDATA[<p>We at Blinkit have many services in Django. We decided to upgrade to Python 3.8 (from 2.7) and Django 2.2 (from 1.8). While it&#39;s a complicated process, upgrading to the latest version of Django has several benefits:</p><ul><li>New features, improvements, better performance, and bug fixes.</li><li>Older versions do not receive any security updates.</li><li>Upgrading to a newer Django makes future upgrades as smooth as possible.</li></ul><h3>Prerequisite</h3><ul><li>Explore the differences between the Python and Django versions involved. You should familiarize yourself with the changes that were made in the newer versions.</li><li>Choose your tool for maintaining compatibility between different Python versions.</li><li>Explore coverage tools to measure your test coverage; this will help ensure you still have a working service.</li></ul><h3>Dependencies</h3><p>Along with Python, in most cases it will be necessary to upgrade your Django-related dependencies to their latest versions as well.</p><h4>Caniusepython3</h4><p>Use <a href="https://pypi.org/project/caniusepython3">caniusepython3</a> to find out which of your dependencies are blocking your use of Python 3.</p><pre>python2 -m pip install caniusepython3</pre><p>Once you have identified the blocking dependencies, you can upgrade each of them to its latest release that supports Python 3.</p><h3>Start Upgrading</h3><p>Now you need to convert your Python codebase from 2 to 3. For this, you can use conversion tools that will make your life easier. We also needed to pause active tasks/features and update all requirements in one go.</p><h4>Futurize</h4><p>We used <a href="https://python-future.org/automatic_conversion.html#futurize-quick-start-guide">futurize</a> to make our codebase Python 2/3 compatible.</p><pre>pip install future</pre><p>You can <a href="https://python-future.org/automatic_conversion.html#futurize-quick-start-guide">futurize</a> your codebase in the following stages:</p><p><strong>Step 1:</strong></p><p>The goal of this step is to modernize the Python 2 code without introducing any new dependencies (on future or e.g. six) at this stage. The command below writes all changes to the original files and keeps a backup copy of each modified file. If you don’t want the changes written automatically, remove -w; the changes will only be displayed in the terminal.</p><pre>futurize --stage1 -w **/*.py</pre><p><strong>Step 2:</strong></p><p>The goal of this step is to get the tests passing first on Python 3 and then on Python 2 again with the help of the future package:</p><p><strong>a. </strong>To preview the changes without writing them, run the following command.</p><pre>futurize --stage2  **/*.py</pre><p>To apply the changes, add the -w argument. If you would like futurize to import all the changed builtins to have their Py3 semantics on Py2, invoke it like this:</p><pre>futurize --stage2 --all-imports myfolder/*.py</pre><p><strong>b</strong>. Re-run your tests on Py3 now. Make changes until your tests pass on Py3.</p><p><strong>c</strong>. Now run your tests on Py2 and note the errors. Add wrappers from the future package to re-enable Python 2 compatibility. See the <a href="https://python-future.org/compatible_idioms.html#compatible-idioms">Cheat Sheet: Writing Python 2–3 compatible code</a> cheat sheet and <a href="https://python-future.org/what_else.html#what-else">What else you need to know</a> for more info.</p><p><strong>d</strong>. After each change, re-run the tests on Py3 and Py2 to ensure they pass on both.</p><p><strong>e</strong>. Don’t forget to include future as a dependency in your requirements.txt.</p><p><strong>f</strong>. You’re done! Celebrate! Commit your changes and push your code.</p><blockquote>Note: It is very important to review the changes after converting from Python 2 to 3 (use the <a href="https://python-future.org/compatible_idioms.html#compatible-idioms">cheat sheet</a>). Some of the compatibility functions have extra memory overhead, and a few are inefficient on Python 3.</blockquote><h4>2to3</h4><p>We can also use the <a href="https://docs.python.org/3/library/2to3.html">2to3</a> package to convert our codebase from Python 2 to Python 3.</p><ul><li>Install <a href="https://docs.python.org/3/library/2to3.html">2to3</a> and run it on *.py files</li><li><a href="https://docs.python.org/3/library/2to3.html">2to3</a> will overwrite the existing Python files, creating backup files for the files it modifies.</li></ul><h3>Python caveats</h3><p>Although futurize and 2to3 will make your code compatible with Python 3, you still have to manually check your entire repository for some subtle errors.</p><h4>Division operator</h4><p>The division operator works differently in Python 2 and Python 3: with two integers, / performs floor division in Python 2 but true (float) division in Python 3.</p><pre># python 2<br>x = 10/5<br>print x # 2<br><br># python3<br>x = 10/5<br>print(x) # 2.0<br><br># python 2<br>for i in range(product_qty/10):<br><br># python 3<br>for i in range(int(product_qty/10)):</pre><p>Make sure nothing in your repository ends up doing float division where an integer is expected, especially inside iteration like range().</p><h4>Rounding decimals</h4><pre># python 2<br><br>from decimal import Decimal<br>round(Decimal(&quot;10.121&quot;), 2) # 10.12<br><br># python 3<br><br>from decimal import Decimal<br>round(Decimal(&quot;10.121&quot;), 2) # Decimal(&quot;10.12&quot;)</pre><h4>Unicode str</h4><pre># python 2<br>from hashlib import md5<br>md5(&quot;abcd&quot;).hexdigest() # &#39;e2fc714c4727ee9395f324cd2e7f331f&#39;<br><br># python 3<br>from hashlib import md5<br>md5(&quot;abcd&quot;).hexdigest() # TypeError: Unicode-objects must be encoded before hashing</pre>
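<p>The fix for the hashing error is to encode the string before hashing it:</p><pre># python 3<br>from hashlib import md5<br>md5(&quot;abcd&quot;.encode(&quot;utf-8&quot;)).hexdigest() # &#39;e2fc714c4727ee9395f324cd2e7f331f&#39;</pre>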
<p>Check out this <a href="https://portingguide.readthedocs.io/en/latest/strings.html">porting guide</a> for more caveats on str/Unicode.</p><h3>Django Changes</h3><p>The following are the changes and challenges we faced after upgrading from Django 1.8 to Django 2.2.</p><h4>On Delete</h4><p>Django 2.2 expects an <em>on_delete</em> argument when declaring a ForeignKey or <em>OneToOneField</em> on Django models. So we added an <em>on_delete=models.DO_NOTHING</em> argument to the existing ForeignKey declarations. The same change also needs to be made in the existing Django migration files, and making it generates a new migration. However, since there should be no alteration to the database, we decided to fake that migration.</p>
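<p>A minimal sketch of the change (the model and field names here are illustrative):</p><pre>from django.db import models<br><br>class Order(models.Model):<br>    # Django &gt;= 2.0 makes on_delete mandatory for ForeignKey/OneToOneField<br>    customer = models.ForeignKey(&#39;Customer&#39;, on_delete=models.DO_NOTHING)</pre><p>The generated migration can then be marked as applied without running any SQL using <em>python manage.py migrate --fake</em>.</p>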
<h4>Transaction hooks and Database Client</h4><p>In Django 1.8, we used <em>transaction_hooks.backends.mysql</em> as our default MySQL database engine. In Django 2.2, transaction hooks are included in Django’s db package by default, so we updated the engine to <em>django.db.backends.mysql</em> for both the master and the slave configuration. Also, the MySQL client we used, MySQL-python, is not supported on Python 3; we upgraded to mysqlclient instead.</p><h4>404 Handler</h4><p>If you’ve written a <a href="https://docs.djangoproject.com/en/2.2/ref/views/#django.views.defaults.page_not_found">custom</a> 404 handler view, the newer Django version requires your view to accept the exception as a <em>required argument</em>.</p><h4>Backward incompatible changes</h4><p>Depending on which Django version you’re moving away from, you should go through the release notes thoroughly for incompatible changes.<br>- <a href="https://docs.djangoproject.com/en/1.11/releases/1.11/#backwards-incompatible-changes-in-1-11">Migrating</a> from Django 1.8 to Django 1.11<br>- <a href="https://docs.djangoproject.com/en/2.2/releases/2.2/#backwards-incompatible-2-2">Migrating</a> from Django 1.11 to Django 2.2</p><h3>Django Rest Framework</h3><p>You also need to upgrade DRF, because older DRF releases do not support newer Django versions.</p><p>Some changes we made after upgrading DRF to 3.12.2:</p><h4>Serializer Fields</h4><p>When defining a serializer, newer DRF expects the fields to be declared explicitly in the Meta class. So, for the existing serializers, we added the following code to resolve the issue:</p><pre>class Meta:<br>  fields = &#39;__all__&#39;</pre><h4>list_route and detail_route decorators</h4><ul><li>Replace <em>detail_route</em> uses with <em>@action(detail=True)</em></li><li>Replace <em>list_route</em> uses with <em>@action(detail=False)</em></li></ul><h3>Celery and RabbitMQ</h3><p>A lot has changed in both Celery and Python in recent years.</p><ul><li>async and await are now reserved keywords in Python (fully reserved since 3.7), which caused a lot of trouble for some <a href="https://github.com/celery/celery/issues/4500">Celery versions</a></li><li>celery&gt;=4.0 <a href="https://docs.celeryproject.org/en/stable/userguide/calling.html#serializers">changed its default</a> task_serializer from <strong>pickle</strong> to <strong>json</strong>, which means something like <em>def task(model_instance)</em> might work on your older Celery but won’t work on newer ones unless you explicitly set pickle as your task_serializer in your settings.py</li><li>We didn&#39;t want to interfere with the old Celery queues, so we created a new virtual host (vhost) in RabbitMQ while keeping the Celery queue names the same. With this, we ensured that new workers pick tasks from the new queues, and old workers (running on old pods) keep consuming from the old queues.</li><li>Check your CELERY_BROKER_URL to find out which user and vhost are used for the RabbitMQ connection. To create a new vhost in RabbitMQ, do the following:</li></ul><pre>ssh &lt;rabbitmq_instance&gt;<br>sudo rabbitmqctl add_vhost v1<br>sudo rabbitmqctl set_permissions -p &quot;v1&quot; &quot;user&quot; &quot;.*&quot; &quot;.*&quot; &quot;.*&quot;</pre><h3>Redis</h3><p>The two stacks use different pickle protocol versions: Django 1.8 on Python 2 writes pickle protocol 2, while Django 2.2 on Python 3 writes pickle protocol 4. Reading one format from the other throws an unsupported pickle protocol ValueError.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/614/1*N7op6G0EHvdkVi8zcFeu7w.png" /></figure><p>Also, Python 3 stores strings as Unicode by default, while Python 2 stores them as bytes. So, when using set and get across the Python 2 and Python 3 environments, we will read different values, which will break our system; for example, a value set from the Python 3 environment will not read back correctly in the Python 2 environment.</p><p>So our solution was to use a <strong>different namespace</strong> (database) in Redis. This separates the two stacks’ data and avoids the pickle version collision. For example, to use DB 10 as the Redis namespace you can use the following configuration.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*GnYW0IrDSmxUGVfrGrxKug.png" /></figure>
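<p>A rough text equivalent of the configuration in the screenshot above, assuming the django-redis-cache backend linked in the references (host and database number are illustrative):</p><pre># settings.py<br>CACHES = {<br>    &#39;default&#39;: {<br>        &#39;BACKEND&#39;: &#39;redis_cache.RedisCache&#39;,<br>        &#39;LOCATION&#39;: &#39;localhost:6379&#39;,<br>        &#39;OPTIONS&#39;: {<br>            &#39;DB&#39;: 10,  # separate Redis database for the new stack<br>        },<br>    },<br>}</pre>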
<h3>That’s it. Hope this helps.</h3><p><em>If you enjoyed this post, I’d be very grateful if you’d help it spread by emailing it to a friend or sharing it on LinkedIn or Twitter. Thank you!</em></p><h4>References</h4><p><a href="https://python-future.org/compatible_idioms.html">https://python-future.org/compatible_idioms.html</a></p><p><a href="https://docs.djangoproject.com/en/2.2/releases/2.2/#backwards-incompatible-2-2">https://docs.djangoproject.com/en/2.2/releases/2.2/#backwards-incompatible-2-2</a></p><p><a href="https://docs.djangoproject.com/en/1.11/releases/1.11/#backwards-incompatible-changes-in-1-11">https://docs.djangoproject.com/en/1.11/releases/1.11/#backwards-incompatible-changes-in-1-11</a></p><p><a href="https://docs.celeryproject.org/en/stable/whatsnew-5.0.html#upgrading-from-celery-4-x">https://docs.celeryproject.org/en/stable/whatsnew-5.0.html#upgrading-from-celery-4-x</a></p><p><a href="https://django-redis-cache.readthedocs.io/en/latest/advanced_configuration.html">https://django-redis-cache.readthedocs.io/en/latest/advanced_configuration.html</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=9cb480541907" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Performance optimization for LIKE queries in PostgreSQL]]></title>
            <link>https://medium.com/swlh/performance-optimization-for-like-queries-in-postgresql-514ba73d9244?source=rss-4ae26f7ee9c------2</link>
            <guid isPermaLink="false">https://medium.com/p/514ba73d9244</guid>
            <category><![CDATA[postgres]]></category>
            <category><![CDATA[index]]></category>
            <category><![CDATA[like]]></category>
            <category><![CDATA[performance]]></category>
            <dc:creator><![CDATA[Maanav Shah]]></dc:creator>
            <pubDate>Tue, 28 Jun 2022 14:47:47 GMT</pubDate>
            <atom:updated>2022-07-06T08:07:38.431Z</atom:updated>
            <content:encoded><![CDATA[<p>Here at blinkit, we’re trying to make sure that we query data quickly and efficiently. Poor queries mean we’re wasting both time and expensive resources.</p><p>One of the most common searches on an e-commerce website is the product catalog, where users search by brand or product name. For example, if you want to buy an Apple MacBook Pro, you may search for either Apple or MacBook.</p><p>For querying such a column we generally use the <strong>LIKE/ILIKE</strong> and <strong>~/~*</strong> operators. A PostgreSQL query will look like this:</p><pre>SELECT * FROM product WHERE product_name LIKE &#39;Apple%&#39;;</pre><pre>SELECT * FROM product WHERE product_name ILIKE &#39;%macbook%&#39;;</pre><p>And it is super slow if we don’t add an index. So if I just add an index here, PostgreSQL will create a <strong>B-tree index</strong>.</p><p>A <strong>B-tree</strong> is a self-balancing tree that maintains sorted data and allows operations in logarithmic time. B-trees can handle range queries on sorted data <em>(&lt;, ≤, &gt;, ≥, between, in, is null, is not null)</em>.</p><pre>CREATE INDEX product_idx_1 ON product (product_name);</pre><p>A B-tree index can also be used for queries that involve the pattern-matching operators <strong>LIKE or ~</strong>, if the pattern is a constant and the anchor is at the beginning of the pattern. For example, queries matching column_name LIKE &#39;Apple%&#39; or column_name ~ &#39;^Apple&#39; can use the index.</p>
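<p>One caveat worth adding: if your database uses a non-C locale (the usual case), a plain B-tree index cannot serve even these anchored LIKE queries; you need to create the index with the text_pattern_ops operator class (the index name below is illustrative):</p><pre>CREATE INDEX product_idx_pattern ON product (product_name text_pattern_ops);</pre>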
<p>But querying &#39;%macbook%&#39; or &#39;%pro&#39; will still not be efficient. For such queries, the query planner resorts to a sequential scan of the full table, which is not optimized.</p><h3>Enter, GIN indexes.</h3><p><em>GIN stands for Generalized Inverted Index.</em></p><p>We can create a GIN index to speed up text searches:</p><pre>CREATE INDEX index_name ON table_name USING GIN (to_tsvector(&#39;english&#39;, column_name));</pre><p>The query above specifies that the english configuration will be used to parse and normalize the strings. For the searching part, a simple query to select each row whose column contains the search term is:</p><pre>SELECT * FROM table_name WHERE to_tsvector(&#39;english&#39;, column_name) @@ to_tsquery(&#39;english&#39;, &#39;text_to_search&#39;);</pre><p>This will also find related words: for example, if you search for <em>friend</em>, it will also match words such as <em>friends</em> and <em>friendly</em>, since all of these are reduced to the same normalized lexeme.</p><p>To test this out, I created a <strong>movies</strong> table with ~1 million records:<br>(Please find the repo here: <a href="https://github.com/maanavshah/gin-index-101">https://github.com/maanavshah/gin-index-101</a>)</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gBp1CyvWr1go7pKmbU1Ntg.png" /><figcaption>table structure — movies</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*IabMsjzLC5WF9SeZxxGPGg.png" /><figcaption>select * from movies limit 10;</figcaption></figure><p>We will compare the performance of the following queries:</p><h4>Query 1: beginning of the pattern</h4><pre>EXPLAIN ANALYSE SELECT * FROM movies WHERE title LIKE &#39;Pirate%&#39;;</pre><h4>Query 2: beginning of the pattern (case-insensitive)</h4><pre>EXPLAIN ANALYSE SELECT * FROM movies WHERE title ILIKE &#39;Pirate%&#39;;</pre><h4>Query 3: contains pattern in the middle</h4><pre>EXPLAIN ANALYSE SELECT * FROM movies WHERE title ILIKE &#39;%sea%&#39;;</pre><h3>No index performance</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*TBqyEvKhYUA5sN84JD2goQ.png" /><figcaption>Execution Time: 51.886 ms</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*C03_bjSIVXCuN9jmdvFeZQ.png" /><figcaption>Execution Time: 109.932 ms</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*-qzpZr0eSLh64-36EbKb_A.png" /><figcaption>Execution Time: 152.286 ms</figcaption></figure><h4>B-tree index performance</h4><p>We can observe that after creating a B-tree index, the performance of the beginning-of-the-pattern query improved from <strong>51.8 ms</strong> to <strong>2.6 ms</strong>; however, performance for queries 2 and 3 did not improve.</p><pre>CREATE INDEX movies_name_idx_0 ON movies (title);</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*eVGCX5VfJUI237_KFpjokw.png" /><figcaption>Execution Time: 2.629 ms</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*bS7tQ6_yq1KTkfsb9WPWoA.png" /><figcaption>Execution Time: 123.529 ms</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*YywPo_81DWqFMoHcMQyJrw.png" /><figcaption>Execution Time: 129.386 ms</figcaption></figure><pre>DROP INDEX movies_name_idx_0;</pre><h4>GIN index performance</h4><pre>CREATE INDEX movies_name_idx_1 ON movies USING GIN (to_tsvector(&#39;english&#39;, title));</pre><p>The syntax for running queries 2 and 3 is as follows:</p><pre>EXPLAIN ANALYZE SELECT * FROM movies WHERE to_tsvector(&#39;english&#39;, title) @@ to_tsquery(&#39;english&#39;, &#39;Pirate&#39;);</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1sYB_EloYoWEJQ2fDM8tPA.png" /><figcaption>Execution Time: 2.294 ms</figcaption></figure><pre>EXPLAIN ANALYZE SELECT * FROM movies WHERE to_tsvector(&#39;english&#39;, title) @@ to_tsquery(&#39;english&#39;, &#39;sea&#39;);</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gX9ZaN3tZJsldJFeavWXkg.png" /><figcaption>Execution Time: 6.330 ms</figcaption></figure><pre>DROP INDEX movies_name_idx_1;</pre><p><strong>Conclusion:</strong></p><p>We can observe that after creating the GIN index, performance for queries 2 and 3 improved significantly compared with the B-tree index: for a match in the middle of the string, query time dropped from <strong>129.3 ms</strong> with the B-tree index to <strong>6.3 ms</strong> with the GIN index.</p><h4>Reference:</h4><p><a href="https://pganalyze.com/blog/gin-index">https://pganalyze.com/blog/gin-index<br></a><a href="https://www.postgresql.org/docs/13/textsearch-tables.html">https://www.postgresql.org/docs/13/textsearch-tables.html</a></p><p>You can also use <strong>pg_trgm</strong> for similar use cases.<br><a href="https://niallburkley.com/blog/index-columns-for-like-in-postgres/">https://niallburkley.com/blog/index-columns-for-like-in-postgres/</a></p>
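<p>For reference, a sketch of the pg_trgm alternative mentioned above: a trigram GIN index lets the planner serve plain LIKE/ILIKE queries directly, with no to_tsvector rewrite (the index name is illustrative):</p><pre>CREATE EXTENSION IF NOT EXISTS pg_trgm;<br>CREATE INDEX movies_title_trgm_idx ON movies USING GIN (title gin_trgm_ops);<br>-- queries like the one below can now use the index directly<br>SELECT * FROM movies WHERE title ILIKE &#39;%sea%&#39;;</pre>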
src="https://cdn-images-1.medium.com/max/1024/1*gX9ZaN3tZJsldJFeavWXkg.png" /><figcaption>Execution Time: 6.330 ms</figcaption></figure><pre>DROP INDEX movies_name_idx_1;</pre><p><strong>Conclusion:</strong></p><p>We can observe that on creating the GIN index, performance for queries 2 and 3 has significantly improved when compared with B-tree. The time taken by the GIN index was reduced from <strong>129.3 ms</strong> to <strong>6.3 ms</strong> over B-tree for matching anchor in the middle.</p><h4>Reference:</h4><p><a href="https://pganalyze.com/blog/gin-index">https://pganalyze.com/blog/gin-index<br></a><a href="https://www.postgresql.org/docs/13/textsearch-tables.html">https://www.postgresql.org/docs/13/textsearch-tables.html</a></p><p>You can also use <strong>pg_trgm</strong> for similar use cases.<br><a href="https://niallburkley.com/blog/index-columns-for-like-in-postgres/">https://niallburkley.com/blog/index-columns-for-like-in-postgres/</a></p><p>That’s it. Hope this helps.</p><p><em>If you enjoyed this post, I’d be very grateful if you’d help it spread by emailing it to a friend or sharing it on LinkedIn or Twitter. Thank you!</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=514ba73d9244" width="1" height="1" alt=""><hr><p><a href="https://medium.com/swlh/performance-optimization-for-like-queries-in-postgresql-514ba73d9244">Performance optimization for LIKE queries in PostgreSQL</a> was originally published in <a href="https://medium.com/swlh">The Startup</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Why use Nginx for Flask/Django/RoR?]]></title>
            <link>https://medium.com/analytics-vidhya/why-use-nginx-for-flask-django-ror-d31a00de2141?source=rss-4ae26f7ee9c------2</link>
            <guid isPermaLink="false">https://medium.com/p/d31a00de2141</guid>
            <category><![CDATA[ruby-on-rails]]></category>
            <category><![CDATA[web-server]]></category>
            <category><![CDATA[flask]]></category>
            <category><![CDATA[nginx]]></category>
            <category><![CDATA[application-server]]></category>
            <dc:creator><![CDATA[Maanav Shah]]></dc:creator>
            <pubDate>Fri, 17 Jul 2020 21:18:37 GMT</pubDate>
            <atom:updated>2020-07-21T05:23:07.143Z</atom:updated>
            <content:encoded><![CDATA[<h3>Why use Nginx? Flask, Django, RoR, and NodeJS are not production servers.</h3><h4>Why do we use Nginx, a web server, in front of an application such as Flask, Django, Ruby on Rails, NodeJS, etc.?</h4><p>I will be talking about <strong>Flask</strong> here, but this applies to all frameworks such as <strong>Django, Ruby on Rails, NodeJS, etc.</strong></p><p>If you are interested in how to deploy Flask applications with uWSGI and Nginx, please check <a href="https://medium.com/@maanavshah/deploy-flask-applications-with-uwsgi-and-nginx-on-ubuntu-18-04-2a47f378c3d2">this</a> out.</p><p>You’ve built your Flask web app and are working on deploying the site. It’s your first, small app and you kinda expected that setting <strong>debug</strong> to <strong>False</strong> on the <strong>app.run</strong> should be enough. Maybe enable <strong>threaded</strong> too?</p><p>You really shouldn’t rely on that. The official docs <a href="https://flask.palletsprojects.com/en/1.1.x/deploying/">agree</a>: they clearly state that <strong>Flask’s built-in server is not suitable for production.</strong></p><p>What now? Well, no need to be confused. All is fine, you just need to understand what the Flask development web server is meant for, what it lacks, and what to use instead.</p><h3>Flask’s Built-In Web Server</h3><p>The built-in Flask web server is provided for development convenience.</p><p>With it, you can make your app accessible on your local machine without having to set up other services and make them play together nicely. However, it is only meant to be used by one person at a time, and it is built that way. It can also serve static files, but does so <em>very slowly</em> compared to tools that are built to do it quickly. This does not matter when only one person is accessing it, so it’s perfect for what it is meant for.</p><p>When running a web app in production, you want it to be able to handle multiple users and many requests, without those fine people having to wait noticeable amounts of time for the pages and static files to load.</p><h3>A Production Stack</h3><p>A production setup usually consists of multiple components, each designed and built to be really good at one specific thing. They are fast, reliable and very focused.</p><p>Communication with the whole thing, as in the case of the built-in web server, happens via HTTP. A request comes in and arrives at the first component — a dedicated <strong>web server</strong>. It is great at reading static files from disk (your CSS files, for example) and handling multiple requests. When a request is not for a static file but for your app, it gets passed on down the stack.</p><p>The <strong>application server</strong> gets those fancy requests and converts the information from them into Python objects which are usable by frameworks. How this is supposed to happen is described by a specification people agreed on — <a href="https://en.wikipedia.org/wiki/Web_Server_Gateway_Interface"><strong>WSGI</strong></a>.</p><p>Your Flask app does not actually <em>run</em> as you would think a server would — waiting for requests and reacting to them. It can be seen as a function that the application server calls, passing in the request object.</p>
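<p>To make this concrete, here is a minimal sketch of what the WSGI contract boils down to; a Flask app object is, at its core, a callable of this shape:</p><pre>def application(environ, start_response):<br>    # environ: a dict describing the request (path, headers, etc.)<br>    # start_response: a callback used to set the status line and headers<br>    start_response(&#39;200 OK&#39;, [(&#39;Content-Type&#39;, &#39;text/plain&#39;)])<br>    return [b&#39;Hello from a WSGI app!&#39;]</pre>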
<p>The output of running your app is then packaged up into an HTTP response by the application server and passed back to the web server, which delivers it to the user.</p><h3>Conclusion</h3><p>If you want to run Flask in production, be sure to use a production-ready web server like <strong>Nginx</strong>, and let your app be handled by a <strong>WSGI</strong> application server like <strong>Gunicorn</strong>.</p><p>If you plan on running on Heroku, a web server is provided implicitly. You just need to specify a command to run the application server (again, Gunicorn is fine) in the Procfile.</p><p>That’s it. Hope this helps.</p><p><em>If you enjoyed this post, I’d be very grateful if you’d help it spread by emailing it to a friend or sharing it on Twitter or Facebook. Thank you!</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=d31a00de2141" width="1" height="1" alt=""><hr><p><a href="https://medium.com/analytics-vidhya/why-use-nginx-for-flask-django-ror-d31a00de2141">Why use Nginx for Flask/Django/RoR?</a> was originally published in <a href="https://medium.com/analytics-vidhya">Analytics Vidhya</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Deploy Flask Applications With uWSGI and Nginx on Ubuntu 18.04]]></title>
            <link>https://medium.com/swlh/deploy-flask-applications-with-uwsgi-and-nginx-on-ubuntu-18-04-2a47f378c3d2?source=rss-4ae26f7ee9c------2</link>
            <guid isPermaLink="false">https://medium.com/p/2a47f378c3d2</guid>
            <category><![CDATA[nginx]]></category>
            <category><![CDATA[deployment]]></category>
            <category><![CDATA[flask]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[wsgi]]></category>
            <dc:creator><![CDATA[Maanav Shah]]></dc:creator>
            <pubDate>Tue, 07 Jul 2020 10:37:12 GMT</pubDate>
            <atom:updated>2020-07-31T15:50:06.783Z</atom:updated>
            <content:encoded><![CDATA[<p>If you are interested, here is <a href="https://medium.com/@maanavshah/why-use-nginx-for-flask-django-ror-d31a00de2141">why we use Nginx in front of an application such as Flask, Django, Ruby on Rails, NodeJS, etc.</a></p><p>You can also read about <a href="https://www.fullstackpython.com/wsgi-servers.html">why WSGI is necessary</a>.</p><p>When you have an Ubuntu (or any other Linux) server and want to set up a Flask application using Nginx and uWSGI, begin by logging in as your non-root user:</p><pre>ssh ip_address</pre><h3>Step 1 — Installing Nginx</h3><p>Because Nginx is available in Ubuntu’s default repositories, it is possible to install it from these repositories using the apt packaging system.</p><p>Since this is our first interaction with the apt packaging system in this session, we will update our local package index so that we have access to the most recent package listings. Afterward, we can install nginx:</p><pre>sudo apt-get update<br>sudo apt-get install nginx</pre><h3>Step 2 — Checking your Web Server</h3><p>At the end of the installation process, Ubuntu 18.04 starts Nginx. The web server should already be up and running.</p><p>We can check with the systemd init system to make sure the service is running by typing:</p><pre>sudo systemctl status nginx</pre><p>As you can see below, the service appears to have started successfully.</p><pre>Output<br>● nginx.service - A high performance web server and a reverse proxy server<br>   Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)<br>   Active: active (running) since Tue 2020-07-07 07:50:48 UTC; 53min ago<br>     Docs: man:nginx(8)<br> Main PID: 10441 (nginx)<br>    Tasks: 2 (limit: 4373)<br>   Memory: 2.9M<br>   CGroup: /system.slice/nginx.service<br>           ├─10441 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;<br>           └─10443 nginx: worker process</pre><p>However, the best way to test this is to actually request a page from Nginx.</p><p>When you have your server’s IP address, enter it into your browser’s address bar:</p><pre><a href="http://your_server_ip">http://your_server_ip</a></pre><p>You should see the default Nginx landing page:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*czQirWf-0g69Sd_sb_XaIw.png" /></figure><p>This page is included with Nginx to show you that the server is running correctly.</p><h3>Step 3 — Managing the Nginx Process</h3><p>Now that you have your web server up and running, let’s review some basic management commands.</p><p>To stop your web server, type:</p><pre>sudo systemctl stop nginx</pre><p>To start the web server when it is stopped, type:</p><pre>sudo systemctl start nginx</pre><p>To stop and then start the service again, type:</p><pre>sudo systemctl restart nginx</pre><p>If you are simply making configuration changes, Nginx can often reload without dropping connections. To do this, type:</p><pre>sudo systemctl reload nginx</pre><p>By default, Nginx is configured to start automatically when the server boots. If this is not what you want, you can disable this behavior by typing:</p><pre>sudo systemctl disable nginx</pre><p>To re-enable the service to start up at boot, you can type:</p><pre>sudo systemctl enable nginx</pre><h3>Step 4 — Installing the Components from the Ubuntu Repositories</h3><p>Our first step will be to install all of the pieces that we need from the Ubuntu repositories.
We will install pip, the Python package manager, to manage our Python components. We will also get the Python development files necessary to build uWSGI.</p><p>First, let’s update the local package index and install the packages that will allow us to build our Python environment. These include python3-pip, along with a few more packages and development tools necessary for a robust programming environment:</p><pre>sudo apt install python3-pip python3-dev build-essential libssl-dev libffi-dev python3-setuptools</pre><p>With these packages in place, let’s move on to creating a virtual environment for our project.</p><h3>Step 5 — Creating a Python Virtual Environment</h3><p>Next, we’ll set up a virtual environment in order to isolate our Flask application from the other Python files on the system.</p><p>Start by installing virtualenv, which we will use to create the environment:</p><pre>pip3 install virtualenv</pre><p>Next, let’s make a parent directory for our Flask project. Move into the directory after you create it:</p><pre>mkdir ~/myproject<br>cd ~/myproject</pre><p>Create a virtual environment to store your Flask project’s Python requirements by typing:</p><pre>python3 -m virtualenv myprojectenv</pre><p>This will install a local copy of Python and pip into a directory called myprojectenv within your project directory.</p><p>Before installing applications within the virtual environment, you need to activate it. Do so by typing:</p><pre>source myprojectenv/bin/activate</pre><p>Your prompt will change to indicate that you are now operating within the virtual environment. It will look something like this: (myprojectenv)user@host:~/myproject$.</p><h3>Step 6 — Setting Up a Flask Application</h3><p>Now that you are in your virtual environment, you can install Flask and uWSGI and get started on designing your application.</p><p>First, let’s install wheel with the local instance of pip to ensure that our packages will install even if they are missing wheel archives:</p><pre>pip install wheel</pre><p>Next, let’s install Flask and uWSGI:</p><pre>pip install uwsgi flask</pre><h3>Creating a Sample App</h3><p>Now that you have Flask available, you can create a simple application. Flask is a microframework. It does not include many of the tools that more full-featured frameworks might, and exists mainly as a module that you can import into your projects to assist you in initializing a web application.</p><p>While your application might be more complex, we’ll create our Flask app in a single file, called myproject.py:</p><pre>vi ~/myproject/myproject.py</pre><p>The application code will live in this file. It will import Flask and instantiate a Flask object. You can use this to define the functions that should be run when a specific route is requested:</p><pre>from flask import Flask<br>app = Flask(__name__)</pre><pre>@app.route(&quot;/&quot;)<br>def hello():<br>    return &quot;&lt;h1 style=&#39;color:blue&#39;&gt;Hello There!&lt;/h1&gt;&quot;</pre><pre>if __name__ == &quot;__main__&quot;:<br>    app.run(host=&#39;0.0.0.0&#39;)</pre><p>This basically defines what content to present when the root domain is accessed.
Save and close the file when you’re finished.</p><p>Now, you can test your Flask app by typing:</p><pre>python myproject.py</pre><p>You will see output like the following, including a helpful warning reminding you not to use this server setup in production:</p><pre>Output<br>* Serving Flask app “myproject” (lazy loading)<br> * Environment: production<br> WARNING: Do not use the development server in a production environment.<br> Use a production WSGI server instead.<br> * Debug mode: off<br> * Running on <a href="http://0.0.0.0:5000/">http://0.0.0.0:5000/</a> (Press CTRL+C to quit)</pre><p>When you are finished, hit CTRL-C in your terminal window to stop the Flask development server.</p><h3>Creating the WSGI Entry Point</h3><p>Next, let’s create a file that will serve as the entry point for our application. This will tell our uWSGI server how to interact with it.</p><p>Let’s call the file wsgi.py:</p><pre>vi ~/myproject/wsgi.py</pre><p>In this file, let’s import the Flask instance from our application and run it:</p><pre>from myproject import app</pre><pre>if __name__ == &quot;__main__&quot;:<br>    app.run()</pre><p>Save and close the file when you are finished.</p>
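<p>Before moving on, you can quickly check that uWSGI is able to serve the application directly (port 5000 here is just an arbitrary test port, assuming it is reachable on your server):</p><pre>uwsgi --socket 0.0.0.0:5000 --protocol=http -w wsgi:app</pre><p>Visit http://your_server_ip:5000 in your browser; once you see your application’s output, stop uWSGI with CTRL-C.</p>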
<p>We’re now done with our virtual environment, so we can deactivate it:</p><pre>deactivate</pre><p>Any Python commands will now use the system’s Python environment again.</p><h3>Creating a uWSGI Configuration File</h3><p>You have tested that uWSGI is able to serve your application, but ultimately you will want something more robust for long-term usage. You can create a uWSGI configuration file with the relevant options for this.</p><p>Let’s place that file in our project directory and call it myproject.ini:</p><pre>vi ~/myproject/myproject.ini</pre><p>Add the following content to the configuration file:</p><pre>[uwsgi]<br>module = wsgi:app</pre><pre>master = true<br>processes = 5</pre><pre>socket = myproject.sock<br>chmod-socket = 660<br>vacuum = true</pre><pre>die-on-term = true</pre><pre>logto = /home/maanav/myproject/myproject.log</pre><p>When you are finished, save and close the file.</p><p><strong><em>Note:</em></strong><em> please remember to change </em>maanav<em> to your username.</em></p><h3>Step 7 — Creating a systemd Unit File</h3><p>Next, let’s create a systemd service unit file. Creating a systemd unit file will allow Ubuntu’s init system to automatically start uWSGI and serve the Flask application whenever the server boots.</p><p>Create a unit file ending in .service within the /etc/systemd/system directory to begin:</p><pre>sudo vi /etc/systemd/system/myproject.service</pre><p>Add the following content to the service file:</p><pre>[Unit]<br>Description=uWSGI instance to serve myproject<br>After=network.target</pre><pre>[Service]<br>User=maanav<br>Group=www-data<br>WorkingDirectory=/home/maanav/myproject<br>Environment=&quot;PATH=/home/maanav/myproject/myprojectenv/bin&quot;<br>ExecStart=/home/maanav/myproject/myprojectenv/bin/uwsgi --ini myproject.ini</pre><pre>[Install]<br>WantedBy=multi-user.target</pre><p>With that, our systemd service file is complete. Save and close it now.</p><p>We can now start the uWSGI service we created and enable it so that it starts at boot:</p><pre>sudo systemctl start myproject<br>sudo systemctl enable myproject</pre><p>Let’s check the status:</p><pre>sudo systemctl status myproject</pre><p>You should see output like this:</p><pre>Output<br>● myproject.service - uWSGI instance to serve myproject<br>   Loaded: loaded (/etc/systemd/system/myproject.service; enabled; vendor preset: enabled)<br>   Active: active (running) since Fri 2018-07-13 14:28:39 UTC; 46s ago<br> Main PID: 30360 (uwsgi)<br>    Tasks: 6 (limit: 1153)<br>   CGroup: /system.slice/myproject.service<br>           ├─30360 /home/maanav/myproject/myprojectenv/bin/uwsgi --ini myproject.ini<br>           ├─30378 /home/maanav/myproject/myprojectenv/bin/uwsgi --ini myproject.ini<br>           ├─30379 /home/maanav/myproject/myprojectenv/bin/uwsgi --ini myproject.ini<br>           ├─30380 /home/maanav/myproject/myprojectenv/bin/uwsgi --ini myproject.ini<br>           ├─30381 /home/maanav/myproject/myprojectenv/bin/uwsgi --ini myproject.ini<br>           └─30382 /home/maanav/myproject/myprojectenv/bin/uwsgi --ini myproject.ini</pre><p>If you see any errors, be sure to resolve them before continuing with the tutorial.</p><h3>Step 8 — Configuring Nginx to Proxy Requests</h3><p>Our uWSGI application server should now be up and running, waiting for requests on the socket file in the project directory. Let’s configure Nginx to pass web requests to that socket using the uwsgi protocol.</p><p>Begin by creating a new server block configuration file in Nginx’s sites-available directory. Let’s call this myproject to keep in line with the rest of the guide:</p><pre>sudo vi /etc/nginx/sites-available/myproject</pre><p>Open up a server block and tell Nginx to listen on the default port 80.
Let’s also tell it to use this block for requests for our server’s domain name:</p><pre>server {<br>    listen 80;<br>    server_name your_domain www.your_domain;</pre><pre>location / {<br>        include uwsgi_params;<br>        uwsgi_pass unix:/home/maanav/myproject/myproject.sock;<br>    }<br>}</pre><p>If you do not have a registered domain, you can use the server’s IP address as the server name:</p><pre>server {<br>    listen 80;<br>    server_name ip_address;</pre><pre>location / {<br>        include uwsgi_params;<br>        uwsgi_pass unix:/home/maanav/myproject/myproject.sock;<br>    }<br>}</pre><p>Save and close the file when you’re finished.</p><p>To enable the Nginx server block configuration you’ve just created, link the file to the sites-enabled directory:</p><pre>sudo ln -s /etc/nginx/sites-available/myproject /etc/nginx/sites-enabled</pre><p>With the file in that directory, we can test for syntax errors by typing:</p><pre>sudo nginx -t</pre><p>If this returns without indicating any issues, restart the Nginx process to read the new configuration:</p><pre>sudo systemctl restart nginx</pre><p>You should now be able to navigate to your server’s domain name (or IP address) in your web browser:</p><p><a href="http://your_domain">http://your_domain</a></p><p><a href="http://ip_address">http://ip_address</a></p><p>You should see your application output:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/568/1*7qELrDk6YVRi-fCdNl_Ddw.png" /></figure><h3>Step 9 — Managing the application process</h3><p>Now that you have your application up and running, let’s review some basic management commands.</p><p>To stop your application, type:</p><pre>sudo systemctl stop myproject</pre><p>To start the application when it is stopped, type:</p><pre>sudo systemctl start myproject</pre><p>To stop and then start the service again, type:</p><pre>sudo systemctl restart myproject</pre><p>To check the status of the application:</p><pre>sudo systemctl status myproject</pre><h3>Logs</h3><p><strong>Application Logs</strong></p><p>/home/maanav/myproject/myproject.log: Every application request is recorded in this log file.</p><p><strong>Server Logs</strong></p><p>/var/log/nginx/access.log: Every request to your web server is recorded in this log file unless Nginx is configured to do otherwise.<br>/var/log/nginx/error.log: Any Nginx errors will be recorded in this log.</p><h3>Conclusion</h3><p>In this guide, you created and deployed a simple Flask application from within a Python virtual environment. You created a WSGI entry point so that any WSGI-capable application server can interface with it, and then configured the uWSGI app server to provide this function. Afterward, you created a systemd service file to automatically launch the application server on boot.</p><p>That’s it. Hope this helps.</p><p><em>If you enjoyed this post, I’d be very grateful if you’d help it spread by emailing it to a friend or sharing it on Twitter or Facebook. Thank you!</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=2a47f378c3d2" width="1" height="1" alt=""><hr><p><a href="https://medium.com/swlh/deploy-flask-applications-with-uwsgi-and-nginx-on-ubuntu-18-04-2a47f378c3d2">Deploy Flask Applications With uWSGI and Nginx on Ubuntu 18.04</a> was originally published in <a href="https://medium.com/swlh">The Startup</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Dog Breed Classifier using CNN]]></title>
            <link>https://medium.com/@maanavshah/dog-breed-classifier-using-cnn-f480612ac27a?source=rss-4ae26f7ee9c------2</link>
            <guid isPermaLink="false">https://medium.com/p/f480612ac27a</guid>
            <dc:creator><![CDATA[Maanav Shah]]></dc:creator>
            <pubDate>Sat, 25 Apr 2020 09:23:02 GMT</pubDate>
            <atom:updated>2020-07-03T17:39:00.660Z</atom:updated>
            <content:encoded><![CDATA[<p>Imagine you are on your weekend jog or walk in the park and you see a really cute dog. Have you ever wondered which breed the dog belonged to? I have…</p><p>There are 266 individual breeds of dog pictured on the website <a href="https://dogtime.com/dog-breeds/profiles">dog time</a>. If you are like me, you can identify no more than 10–15 of them.</p><p>So, when I was given a choice of a few different projects for the Data Scientist Nanodegree by Udacity, I chose the ‘Dog Breed Classifier Project’. This is a very popular project across the machine learning and artificial intelligence Nanodegree programs offered by <a href="https://www.udacity.com/">Udacity</a>.</p><h3>Overview</h3><p>The aim of the project in the Data Scientist Nanodegree was to create a web application that is able to <strong>identify a breed of dog</strong> if given a photo or image as input. If the photo or image contains a human face (or alien face), then the application will return the breed of dog that most resembles this person.</p><p>The project uses Convolutional Neural Networks (CNNs)! A pipeline is built to process real-world, user-supplied images. Given an image of a dog, the algorithm will produce an estimate of the canine’s breed. If supplied with an image of a human, the code will identify the most resembling dog breed.</p><p>The steps that were followed to work through the project were the following:</p><ul><li>Step 0: Import Datasets</li><li>Step 1: Detect Humans</li><li>Step 2: Detect Dogs</li><li>Step 3: Create a CNN to classify Dog Breeds (from scratch)</li><li>Step 4: Use a CNN to classify Dog Breeds (using Transfer Learning)</li><li>Step 5: Create a CNN to classify Dog Breeds (using Transfer Learning)</li><li>Step 6: Write an algorithm</li><li>Step 7: Test the algorithm</li></ul><p>In this project, I experimented with both Keras and Fast.AI to build the Convolutional Neural Network (CNN) that makes the dog predictions.</p><p>I set myself a target test accuracy of 90% for the CNN, i.e., the model correctly identifies the dog breed 9 times out of 10. We will be using the accuracy metric on the testing dataset to measure the performance of our models.</p><p>To follow along with the steps you can download or clone the notebook from my <a href="https://github.com/maanavshah/dog-breed-classifier">GitHub</a> repository. The repository features ‘dog_breed_classifier.ipynb’, which runs on the GPU provided for free by Google Colab.</p><p><strong>Step 0: Import Datasets</strong></p><p>The datasets were provided by Udacity.</p><ul><li>Dog Images — The dog images provided are available in the repository within the Images directory, further organized into train, valid, and test subfolders</li><li>Human Faces — An exhaustive dataset of faces of celebrities has also been added to the repository in the lfw folder</li><li>Haarcascades — An ML-based approach where a cascade function is trained on a lot of positive and negative images and then used to detect objects in other images. The algorithm uses the Haar frontal-face cascade to detect humans, so it expects an image with the frontal features clearly defined</li><li>Test Images — A folder with certain test images has been added to be able to check the effectiveness of the algorithm</li><li>Pre-computed features for networks currently available in Keras (i.e. VGG19, InceptionV3, and Xception) will be made available from S3</li><li>Any other downloads to ensure the smooth running of the notebook are available in the repository.</li></ul><p>Load all the libraries and packages required through the notebook.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*cy9Rxg_wTpxJMRBs.png" /></figure><p>The libraries required can be categorized as follows:</p><ul><li>Utility libraries — random (for random seeding), timeit (to calculate execution time), os, pathlib, glob (for folder and path operations), tqdm (for execution progress), sklearn (for loading datasets), requests and io (to load files from the web)</li><li>Image processing — OpenCV (cv2), PIL</li><li>Keras and Fastai for creating CNNs</li><li>Matplotlib for viewing plots/images and NumPy for tensor processing</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/960/0*ykVxIzVc-Xk5XOkq.png" /></figure><p>Use the dataset-loading function from sklearn to import our datasets for dog breed model training, as sketched below. Create the lists of training, validation, and test filenames and the dog breed labels, plus a few paths that will be used later.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*tK2ul-6W0Bd-3dSD.png" /></figure>
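<p>For reference, a sketch of the dataset-loading step shown in the screenshot above, along the lines of the Udacity starter code (the dogImages paths depend on where you unpack the data):</p><pre>import numpy as np<br>from sklearn.datasets import load_files<br>from keras.utils import np_utils<br><br>def load_dataset(path):<br>    data = load_files(path)<br>    dog_files = np.array(data[&#39;filenames&#39;])<br>    # one-hot encode the 133 breed labels<br>    dog_targets = np_utils.to_categorical(np.array(data[&#39;target&#39;]), 133)<br>    return dog_files, dog_targets<br><br>train_files, train_targets = load_dataset(&#39;dogImages/train&#39;)<br>valid_files, valid_targets = load_dataset(&#39;dogImages/valid&#39;)<br>test_files, test_targets = load_dataset(&#39;dogImages/test&#39;)</pre>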
<p><strong>Dataset stats</strong></p><p>The dog_names variable stores a list of the class names to use in our prediction model. Based on the path names, we see a total of 8351 images of dogs belonging to 133 different dog breeds, split into 6680, 835, and 836 images for training, validation, and testing.</p><h3>Step 1: Detect Humans based on OpenCV Haar cascade classifiers</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*MNrLlE45PiaSOpaK.png" /></figure><p>Object detection using Haar feature-based cascade classifiers is an effective object detection method proposed by Paul Viola and Michael Jones in their 2001 paper, “<a href="https://www.cs.cmu.edu/~efros/courses/LBMV07/Papers/viola-cvpr-01.pdf">Rapid Object Detection using a Boosted Cascade of Simple Features</a>”. It is a machine learning-based approach where a cascade function is trained on a lot of positive and negative images, which is then used to detect objects in other images.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/842/0*i62QezreWHb-j0zz.png" /></figure><p>We use OpenCV’s implementation of Haar feature-based cascade classifiers to detect human faces in images. OpenCV provides many pre-trained face detectors, stored as XML files on GitHub. Before using any of the face detectors, it is standard procedure to convert the images to grayscale. The detectMultiScale function executes the classifier stored in face_cascade and takes the grayscale image as a parameter. The face_detector function takes a string-valued file path to an image as input and returns True if a human face is detected in the image and False otherwise. While testing the human face detector, all 100 sample human faces were detected as human faces, while 11 of the 100 dog faces were also (incorrectly) detected as human faces.</p>
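<p>A sketch of the face_detector just described (the cascade XML path depends on where the haarcascades folder lives in your clone):</p><pre>import cv2<br><br>face_cascade = cv2.CascadeClassifier(&#39;haarcascades/haarcascade_frontalface_alt.xml&#39;)<br><br>def face_detector(img_path):<br>    img = cv2.imread(img_path)<br>    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # cascades expect grayscale<br>    faces = face_cascade.detectMultiScale(gray)<br>    return len(faces) &gt; 0</pre>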
<p>ImageNet contains over 10 million URLs, each linking to an image containing an object from one of <a href="https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a">1000 categories</a>. Given an image, this pre-trained ResNet-50 model returns a prediction (derived from the available categories in ImageNet) for the object contained in the image.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/615/0*aUpPDe5jIqoXITau.png" /></figure><p>When using TensorFlow as the backend, Keras CNNs require a 4D array (which we’ll also refer to as a 4D tensor) as input, with shape (nb_samples, rows, columns, channels), where nb_samples corresponds to the total number of images (or samples), and rows, columns, and channels correspond to the number of rows, columns, and channels for each image, respectively.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*mDhwahGRyz5FQByp.png" /></figure><p>Create tensor input from paths to images</p><p>The path_to_tensor function takes a string-valued file path to a color image as input and returns a 4D tensor suitable for supplying to a Keras CNN. The function first loads the image and resizes it to a square image that is 224×224 pixels. Next, the image is converted to an array, which is then reshaped into a 4D tensor. In this case, since we are working with color images, each image has three channels. Likewise, since we are processing a single image (or sample), the returned tensor will always have shape (1, 224, 224, 3).</p><p>The paths_to_tensor function takes a NumPy array of string-valued image paths as input and returns a 4D tensor with shape (nb_samples, 224, 224, 3). Here, nb_samples is the number of samples, or number of images, in the supplied array of image paths. It is best to think of nb_samples as the number of 3D tensors (where each 3D tensor corresponds to a different image).</p><p>In addition, ResNet-50 requires some extra preprocessing, such as reordering the channels from RGB to BGR and normalizing the pixels, which is done using preprocess_input.</p><p>The model is then used to extract the predictions. The predict method returns an array whose 𝑖-th entry is the model&#39;s predicted probability that the image belongs to the 𝑖-th ImageNet category. This is implemented in the ResNet50_predict_labels function below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/807/0*bw4nBvM0AETxhI7d.png" /></figure><p>The categories corresponding to dogs appear in an uninterrupted sequence, corresponding to keys 151–268 inclusive, covering all categories from &#39;Chihuahua&#39; to &#39;Mexican hairless&#39;. So, if the function returns any number between 151 and 268, the supplied image is that of a dog.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/945/0*hy1E5snRcFi_ZL-m.png" /></figure><p>The dog_detector function above returns True if a dog is detected in an image (and False if not). As expected, none of the sample human images was detected as a dog, and all sample dog images were.</p><h3>Step 3: Create a CNN to Classify Dog Breeds</h3><p>The model that I selected had a CNN architecture of 4 convolutional layers alternating with max-pooling layers, with 10% dropout and batch normalization. The filters used were 16, 32, 64 and 128.</p>
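<p>Roughly, this from-scratch architecture looks like the following in Keras. This is a sketch: the kernel sizes, padding and exact placement of the dropout and batch-normalization layers are assumptions, not the notebook’s literal code.</p><pre>from keras.models import Sequential<br>from keras.layers import Conv2D, MaxPooling2D, BatchNormalization<br>from keras.layers import Dropout, GlobalAveragePooling2D, Dense<br><br>model = Sequential()<br>model.add(Conv2D(16, (3, 3), activation=&#39;relu&#39;, padding=&#39;same&#39;,<br>                 input_shape=(224, 224, 3)))<br>model.add(MaxPooling2D(2))<br>model.add(BatchNormalization())<br>for filters in [32, 64, 128]:<br>    model.add(Conv2D(filters, (3, 3), activation=&#39;relu&#39;, padding=&#39;same&#39;))<br>    model.add(MaxPooling2D(2))<br>    model.add(BatchNormalization())<br>model.add(Dropout(0.1))  # 10% dropout to curb over-fitting<br>model.add(GlobalAveragePooling2D())<br>model.add(Dense(133, activation=&#39;softmax&#39;))  # one output per breed<br>model.compile(optimizer=&#39;rmsprop&#39;,<br>              loss=&#39;categorical_crossentropy&#39;, metrics=[&#39;accuracy&#39;])</pre>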
<p>The drop-outs were used to reduce the possibility of over-fitting. The convolutional stack is followed by a global average pooling layer and then a dense layer that identifies the 133 breeds.</p><p>The model takes a 4D tensor with shape (1, 224, 224, 3) and outputs an array of 133 probabilities. The optimizer used was RMSProp and the metric used was accuracy. The model was run for 10 epochs and achieved an accuracy of 6.69%.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/852/0*TO5LgtBvOMxXHDn5.png" /></figure><p>CNN model from scratch</p><h3>Step 4: Use a CNN to Classify Dog Breeds</h3><p>I used VGG16 to demonstrate the use of Transfer Learning. Bottleneck features come from taking a pre-trained model and chopping off the top classifying layers, then providing this “chopped” VGG16 as the first layer of our model.</p><p>The bottleneck features are the last activation maps in VGG16 (the fully-connected layers for classifying have been cut off), making it an effective feature extractor. The bottleneck features were obtained from a URL where they are stored as a .npz file, using the BytesIO library along with requests for the download.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*EVEYiq3i70XPyT4R.png" /></figure><p>The pre-trained VGG-16 model was then used as a fixed feature extractor, where the last convolutional output of VGG-16 is fed as input to our model. The shape of the VGG16 bottleneck features was (6680, 7, 7, 512), i.e. a (7, 7, 512) activation map for each of the 6680 training samples. On top of this sit a global average pooling layer and a fully connected layer, where the latter contains one node for each dog category and is equipped with a softmax. Running this model for 20 epochs increased the accuracy to 47%. This demonstrates the benefit of leveraging Transfer Learning from pre-trained models.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/955/0*WfTwLgKSVMAH4iqY.png" /></figure><h3>Step 5: Create a CNN to Classify Dog Breeds (using Transfer Learning)</h3><p>The model was built using Keras, leveraging Transfer Learning. I tried 4 different models: VGG19, ResNet50, InceptionV3, and Xception.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*6ufthOs4Q6k-B1bQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*pJKDHowoJ1jXM3Uz.png" /></figure><p>The bottleneck feature shapes are VGG19: (6680, 7, 7, 512), ResNet50: (6680, 1, 1, 2048), InceptionV3: (6680, 5, 5, 2048) and Xception: (6680, 7, 7, 2048). It took about 160 seconds to load all the Transfer Learning models.</p><p>Each of these models was then topped with a global average pooling layer and a dropout layer, followed by a fully connected layer (with softmax), and run for 20 epochs.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Xu9ppBwm7EZcAisO.png" /></figure><p>Training the models took less than a minute in each of these cases.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/385/0*AXoyaJOCrtEf19Jm.png" /></figure><p>Training Time in seconds</p><p>Accuracy for Xception was ~85%, while VGG19 was ~46%.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/460/0*VekTrnlNS1jRuTTa.png" /></figure>
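<p>The head added on top of each set of bottleneck features is tiny. A sketch of the idea (variable names such as train_vgg19 are hypothetical placeholders for the pre-computed features and targets, and the dropout rate is an assumption):</p><pre>from keras.models import Sequential<br>from keras.layers import GlobalAveragePooling2D, Dropout, Dense<br><br># train_vgg19 holds pre-computed bottleneck features, e.g. shape (6680, 7, 7, 512)<br>model = Sequential()<br>model.add(GlobalAveragePooling2D(input_shape=train_vgg19.shape[1:]))<br>model.add(Dropout(0.2))<br>model.add(Dense(133, activation=&#39;softmax&#39;))<br>model.compile(optimizer=&#39;rmsprop&#39;,<br>              loss=&#39;categorical_crossentropy&#39;, metrics=[&#39;accuracy&#39;])<br>model.fit(train_vgg19, train_targets,<br>          validation_data=(valid_vgg19, valid_targets),<br>          epochs=20, batch_size=32)</pre>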
<p>I then explored options for increasing the accuracy. I used fastai to see if we could leverage transfer learning there and obtain a higher accuracy.</p><p>The data bunch was created and normalized.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/795/0*GQ22UzGMsYTDzDmC.png" /></figure><p>A cnn_learner was created with the resnet34 model and run for two cycles. The accuracy was up to 86%. An optimal learning rate seems to lie between 1e-6 and 1e-4.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*EZNN_YPRDY96Gi5X.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1020/0*o3l6jGfKmaB0c9Pt.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/815/0*W54ZSVqidCmn6n70.png" /></figure><p>After unfreezing and refitting the model for 10 epochs, an accuracy of up to 89.8% is obtained, meaning roughly 9 out of 10 images are classified accurately.</p><p>Based on the analysis of the various models that we have fit, the learn_resnet34 model provides the best accuracy. It is saved and exported as a pickle file for classification.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*7iC3cFrasl3kR23-.png" /></figure><h3>Step 6: Write an algorithm to provide an output breed based on an image</h3><p>Given an image path, the bottleneck features of our pre-trained model are applied to the image, which is then processed through our trained fully-connected model. This yields a predicted_breed, the category index, and the probability tensor. The predict_breed function takes a file_path as input and outputs the breed of the dog.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*-vp89gGdM0vS1Rsu.png" /></figure><p>Our algorithm accepts a file path to an image and first determines whether the image contains a human, dog, or neither. Then,</p><ul><li>if a <strong>dog</strong> is detected in the image, return the predicted breed.</li><li>if a <strong>human</strong> is detected in the image, return the resembling dog breed.</li><li>if <strong>neither</strong> is detected in the image, provide an output that indicates an error.</li></ul><p>The algorithm leverages the CNN built in Step 5, along with the previously created functions, to come up with an output.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/740/0*VNzUBtetW2wtL9by.png" /></figure><p>The algo function determines whether the provided file_path contains a dog, a human or neither, and returns the species along with the predicted breed for the image.</p><p>The provide_output function outputs a greeting based on the predicted species and dog breed.</p><h3>Step 7: Test Your Algorithm</h3><p>The six dogs that were sampled to check the algorithm were all correctly identified as dogs. The breeds of 5 of the 6 were accurate too. 
Only 1 dog (a Rajapalayam, a native Indian breed) was identified as a Great Dane, possibly because the Rajapalayam is not one of the 133 breeds in the training dataset.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/447/0*hdoYzxoDhpnqUS5d.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/365/0*dhLaB-O1mwEcBp0S.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/397/0*u3R2CY2Zb7CWUCeA.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/360/0*_C7u1YAS7jOShovq.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/355/0*_jgKytG_5xwfAAI-.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/345/0*t9VYpoEJIGjNWtdN.png" /></figure><p>The humans were also identified as humans, and a resembling dog breed was predicted for each; incidentally, both were predicted as Dogue_de_bordeaux.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/430/0*Al4NGn2P8uZmma5f.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/405/0*zZw-WeHD3wcqd9tM.png" /></figure><h3>Reflection</h3><p>At the start, my objective was to create a CNN with <strong>90%</strong> testing accuracy. Our final model obtained <strong>89.8%</strong> testing accuracy.</p><p>A few breeds are virtually identical or are sub-breeds of one another. Some images may also be blurred or contain too much noise, and additional image manipulation could enhance their quality.</p><p>By addressing the above areas, I’m confident we could increase the testing accuracy of the model to above 90%.</p><p>A simple web application in Flask could be built to leverage the model to predict breeds from user-input images.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=f480612ac27a" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Fixing python import error “No module named appengine”]]></title>
            <link>https://medium.com/@maanavshah/fixing-python-import-error-no-module-named-appengine-ebcb540e7f18?source=rss-4ae26f7ee9c------2</link>
            <guid isPermaLink="false">https://medium.com/p/ebcb540e7f18</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[app-engine]]></category>
            <category><![CDATA[google-cloud-platform]]></category>
            <category><![CDATA[importerror]]></category>
            <dc:creator><![CDATA[Maanav Shah]]></dc:creator>
            <pubDate>Fri, 22 Nov 2019 14:55:59 GMT</pubDate>
            <atom:updated>2019-11-22T14:55:59.649Z</atom:updated>
            <content:encoded><![CDATA[<p>Have you ever tried importing a package from the Google App Engine library?</p><p>If it throws an import error, it means that either the App Engine SDK is not installed, or the Python runtime cannot find it.</p><p>For example,</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*2AjGWARhTXeMI98KXAuLWQ.png" /></figure><p>You can check whether the google <em>__path__ </em>is correctly linking to the App Engine SDK. For example, in my case, it is pointing to site-packages in the virtual environment.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/976/1*0gt0ovFrdzMJYruxkEQseA.png" /></figure><p>You can execute the following steps in order to solve the ImportError:</p><ol><li><strong>Install the App Engine SDK</strong><br>Ensure App Engine is correctly installed on your system. Read and follow the instructions here:<br><a href="https://cloud.google.com/appengine/downloads#Google_App_Engine_SDK_for_Python">SDK for App Engine</a></li><li><strong>Install Google pip packages<br></strong>You should have the <strong>google-api-core</strong> package installed. You can install it using the following command:<br><em>pip install google-api-core</em></li><li><strong>Configure google <em>__path__ </em>in the Python shell<br></strong>The following snippet points google.__path__ at the App Engine SDK:</li></ol><pre>import google<br>import sys</pre><pre># make the SDK&#39;s packages visible on the google package&#39;s search path<br>google.__path__.append(&#39;/path/to/appengine_sdk/google_appengine/google&#39;)<br>sys.path.insert(0, &#39;/path/to/appengine_sdk/google_appengine&#39;) # might not be necessary</pre><pre>import google.appengine # now it&#39;s on your import path</pre><p>For example, you can see that the google <em>__path__ </em>now correctly points to the App Engine SDK, and we can also import the library.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*d_GhnOrnT2e9noEUifhDdQ.png" /></figure><p>That’s it. Hope this helps.</p><p><em>If you enjoyed this post, I’d be very grateful if you’d help it spread by emailing it to a friend or sharing it on Twitter or Facebook. Thank you!</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=ebcb540e7f18" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Uploading multiple files to Google Cloud Storage with Python]]></title>
            <link>https://medium.com/swlh/uploading-multiple-files-to-google-cloud-storage-with-python-7780aefa1569?source=rss-4ae26f7ee9c------2</link>
            <guid isPermaLink="false">https://medium.com/p/7780aefa1569</guid>
            <category><![CDATA[google-cloud-storage]]></category>
            <category><![CDATA[google-cloud-platform]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[angular]]></category>
            <category><![CDATA[file-upload]]></category>
            <dc:creator><![CDATA[Maanav Shah]]></dc:creator>
            <pubDate>Mon, 19 Aug 2019 14:30:54 GMT</pubDate>
            <atom:updated>2019-09-20T11:46:24.529Z</atom:updated>
            <content:encoded><![CDATA[<p>You can use <a href="https://cloud.google.com/storage/"><strong>Google Cloud Storage</strong></a> in your Google App Engine applications to upload, store and serve files. These files can be uploaded directly to Google Cloud Storage.</p><p>Let’s see how to implement this in a simple Python and AngularJS application.</p><p>The steps are pretty easy, as is the code.</p><h4>Activate Google Cloud Storage and install the client library</h4><p>Activate Google Cloud Storage for your project by selecting the option in the Google Developers Console. <br>You can have a look at this <a href="https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/activate"><strong>introduction</strong></a> for a list of steps.<br>At the end of the process, you will have a <em>bucket </em>for your project. The bucket is a sort of virtual folder in GCS and is the place where the uploaded files will be stored and read. <br>You also need to install the Python <a href="https://cloud.google.com/appengine/docs/standard/python/googlecloudstorageclient/setting-up-cloud-storage"><strong>Storage library</strong></a> in your project; the library contains the module that you will import in your backend code to work with GCS. For this, you need to set up a lib folder in your Google Cloud project using:<br><a href="https://cloud.google.com/appengine/docs/standard/python/tools/using-libraries-python-27">https://cloud.google.com/appengine/docs/standard/python/tools/using-libraries-python-27</a></p><p>The following command installs the storage and authentication dependencies in the lib folder.</p><pre>pip install -t lib google-cloud-storage oauth2client GoogleAppEngineCloudStorageClient</pre><p>You also need to provide the correct access to the storage bucket. These permissions come under <em>roles/storage. </em>You can check out the storage <a href="https://cloud.google.com/storage/docs/access-control/iam-roles#legacy-roles">legacy roles</a> and grant the correct access to the bucket.</p><h4>Create the upload page</h4><p>In my code, I created an HTML page to upload the files. The page includes a simple select button to pick the multiple files to upload, and some extra code to show the upload progress and result.</p><pre>&lt;div ng-app=&quot;uploadModule&quot; ng-controller=&quot;uploadController&quot;&gt;<br>  &lt;h1&gt; Upload Files &lt;/h1&gt;<br>  &lt;form&gt;<br>    &lt;input type=&quot;file&quot; id=&quot;files&quot; name=&quot;files[]&quot; multiple/&gt;<br>    &lt;button type=&quot;submit&quot; onclick=&quot;angular.element(this).scope().upload_file(this)&quot;&gt;<br>      Upload<br>    &lt;/button&gt;<br>    &lt;div&gt;<br>      &lt;br&gt;<br>      {{ upload_status }}<br>      &lt;br&gt;<br>    &lt;/div&gt;<br>  &lt;/form&gt;<br>&lt;/div&gt;</pre><p>The AngularJS controller (<em>uploadController</em>) for the page is in controller.js. 
The controller implements a callback for the upload button that collects the selected files into a FormData object and sends them with an XMLHttpRequest.</p><p>What is important here is to specify the URL that will process the uploaded files (in my case, api/file/upload); this endpoint should contain the backend code that responds to the POST requests containing the file data and saves the files in Google Cloud Storage.</p><pre>var uploadModule = angular.module(&quot;uploadModule&quot;, []);</pre><pre>uploadModule.controller(&quot;uploadController&quot;, [&quot;$scope&quot;, &quot;$http&quot;, &quot;$window&quot;, function ($scope, $http, $window) {</pre><pre>$scope.upload_status = &quot;STATUS: Please select files. &quot;;</pre><pre>$scope.upload_file = function (e) {<br>    $scope.upload_status = &quot;STATUS: Uploading ...&quot;;</pre><pre>var formdata = new FormData(); // FormData object<br>    var fileInput = document.getElementById(&#39;files&#39;);</pre><pre>var selectedFiles = fileInput.files.length;<br>    if (selectedFiles &lt; 1) {<br>      alert(&#39;Please select files!&#39;)<br>    } else {<br>      // Iterating through each file selected in fileInput<br>      for (var i = 0; i &lt; fileInput.files.length; i++) {<br>        console.log(fileInput.files[i].name)<br>        <br>        // Appending each file to the FormData object<br>        formdata.append(&#39;files[]&#39;, fileInput.files[i], fileInput.files[i].name);<br>      }</pre><pre>      // Creating an XMLHttpRequest and sending<br>      var xhr = new XMLHttpRequest();<br>      var url = encodeURI(&quot;/api/file/upload/&quot;);<br>      xhr.open(&#39;POST&#39;, url);<br>      xhr.send(formdata);<br>      xhr.onreadystatechange = function () {<br>        if (xhr.readyState != 4) return;<br>        if (xhr.status == 200) {<br>          console.log(&quot;Success&quot;);<br>          $scope.upload_status = &quot;Status: Upload successful.&quot;;<br>          $scope.$apply();<br>        } else {<br>          console.log(&quot;Error&quot;);<br>          $scope.upload_status = &quot;Status: Error while uploading.&quot;;<br>          $scope.$apply();<br>        }<br>      }<br>    }<br>  };<br>}]);</pre><h4><strong>Create the service account for external page authentication</strong></h4><p>We need to create a service account that we will use for external authentication. This allows us to remove the dependency on Google Auth by simply providing the path to the service account’s JSON key file. You can follow this guide to do so: <a href="https://cloud.google.com/video-intelligence/docs/common/auth">https://cloud.google.com/video-intelligence/docs/common/auth</a>. This will help us in setting up a service account for external page access.</p><h4>Create the backend code for storing the files</h4><p>Now, we create the Python code for the backend that receives the uploaded files and stores them in Google Cloud Storage.</p><p>Change your app.yaml to create an endpoint for the upload page created in step 2. 
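In my case this is api/file/upload and, as we saw, it is the URL used by the upload function in the controller.</p><p>A minimal handler entry for this endpoint could look like the following sketch (it assumes the webapp2 application object is exposed as app in a module named main.py):</p><pre># app.yaml (sketch)<br>handlers:<br>- url: /api/file/upload/.*<br>  script: main.app</pre><p>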
Then, define a request handler for a POST request to that address (I use webapp2 as the web framework, but the same concepts also apply to Django or Flask):</p><p>We need to import the following libraries.</p><pre>import json<br>import logging<br>import os<br><br>import webapp2<br>from google.cloud import storage<br>from google.oauth2 import service_account</pre><p>Following is the FileUpload class, which uploads all the files to Google Cloud Storage.</p><pre>class FileUpload(webapp2.RequestHandler):<br>    &#39;&#39;&#39;Handles Upload requests.&#39;&#39;&#39;<br>    def post(self):<br>        response = {}<br>        try:<br>            files = self.request.POST<br>            # authenticate with the service account&#39;s JSON key file<br>            file_path = os.path.join(os.path.dirname(__file__),<br>                                     &#39;path_to_service_account.json&#39;)<br>            credentials = service_account.Credentials.from_service_account_file(<br>                file_path)<br>            storage_client = storage.Client(<br>                credentials=credentials, project=&#39;project-name&#39;)<br>            bucket = storage_client.get_bucket(&#39;bucket-name&#39;)</pre><pre>            # upload each posted file to the bucket under its own name<br>            for file in files.values():<br>                filename = file.filename<br>                file_blob = bucket.blob(filename)<br>                file_blob.upload_from_file(file.file)<br>            response[&#39;success&#39;] = True<br>        except Exception as ex:<br>            logging.error(&#39;Error while uploading files: %s&#39;, ex)<br>            response[&#39;message&#39;] = str(ex)<br>            response[&#39;success&#39;] = False<br>        self.response.content_type = &quot;application/json&quot;<br>        self.response.write(json.dumps(response))</pre><h4>Setting up the Sockets API</h4><p>But as we are going to allow this page to be accessed externally, without Google Authentication (<a href="https://developers.google.com/identity/protocols/OAuth2">Google OAuth</a>2), we need to set up the Sockets API. This enables us to remove the dependency on Google accounts and Google Auth; we can simply use a separate authentication library, or create one of our own if need be.</p><p>We need to use the Sockets API for this purpose, which consists of adding the following code to your app.yaml file:</p><pre>env_variables:<br>  GAE_USE_SOCKETS_HTTPLIB: &#39;true&#39;</pre><pre>libraries:<br>- name: ssl<br>  version: latest</pre><p>We need to add this because of a known issue between urllib3 and Google App Engine, as described in <a href="https://urllib3.readthedocs.io/en/latest/advanced-usage.html#google-app-engine">this link</a>. Once you have added these lines, the page can be accessed directly using the Sockets API.<br>You can find the <a href="https://cloud.google.com/appengine/docs/standard/python/sockets/">official Google Sockets Python API docs here</a>.</p><p>After that, you can just redeploy and connections should work.</p><p>That’s it. Hope this helps.</p><p><em>If you enjoyed this post, I’d be very grateful if you’d help it spread by emailing it to a friend or sharing it on Twitter or Facebook. 
Thank you!</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=7780aefa1569" width="1" height="1" alt=""><hr><p><a href="https://medium.com/swlh/uploading-multiple-files-to-google-cloud-storage-with-python-7780aefa1569">Uploading multiple files to Google Cloud Storage with Python</a> was originally published in <a href="https://medium.com/swlh">The Startup</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Change the default MySQL Data Directory in Linux]]></title>
            <link>https://medium.com/@maanavshah/change-the-default-mysql-data-directory-in-linux-43b813b48e46?source=rss-4ae26f7ee9c------2</link>
            <guid isPermaLink="false">https://medium.com/p/43b813b48e46</guid>
            <category><![CDATA[default-directory]]></category>
            <category><![CDATA[mysql]]></category>
            <category><![CDATA[command-line]]></category>
            <category><![CDATA[change-default-directory]]></category>
            <category><![CDATA[linux]]></category>
            <dc:creator><![CDATA[Maanav Shah]]></dc:creator>
            <pubDate>Thu, 01 Aug 2019 08:20:43 GMT</pubDate>
            <atom:updated>2019-08-01T08:20:43.946Z</atom:updated>
            <content:encoded><![CDATA[<p>After installing a MySQL database on a production server, we may want to change the default data directory of MySQL to a different directory. This is the case when the directory is expected to grow due to heavy usage; otherwise, the filesystem where /var is stored may fill up at some point, causing the entire system to fail. Another scenario for changing the default directory is when we have a dedicated network share that we want to use to store our actual data. MySQL uses the /var/lib/mysql directory as the default data directory on Linux-based systems.</p><p>In order to change the default directory, we need to check the available storage. We can use the df command to discover drive space on Linux. The output of df -H will report how much space is used and available, the percentage used, and the mount point of every disk attached to your system.</p><p>We are going to assume that our new data directory is /mnt/mysql-data. It is important to note that this directory should be owned by mysql:mysql.</p><pre>mkdir -p /mnt/mysql-data</pre><p>For simplicity, I’ve divided the procedure into 4 simple steps.</p><h3>Step 1: Identify the Current MySQL Data Directory</h3><p>To identify the current data directory, use the following command.</p><pre>mysql -u username -p -e “SELECT @@datadir”</pre><p>We need to identify the current MySQL data directory because it may have been changed in the past. Let’s assume the current data directory is /var/lib/mysql</p><h3>Step 2: Copy the MySQL Data Directory to the desired location</h3><p>To avoid data corruption, stop the service if it is currently running before proceeding, and check its status.</p><pre>service mysqld stop<br>service mysqld status</pre><p>Then recursively copy the contents of /var/lib/mysql to /mnt/mysql-data, preserving the original permissions and timestamps:</p><pre>cp -rap /var/lib/mysql/* /mnt/mysql-data</pre><p>Change the ownership of the directory, as its owner should be mysql:mysql. We can use the following command:</p><pre>chown -R mysql:mysql /mnt/mysql-data</pre><h3>Step 3: Configure the new MySQL Data Directory</h3><p>Edit the MySQL default configuration file <strong>/etc/my.cnf</strong> and update the values of <strong>mysqld</strong> and <strong>client</strong>.</p><pre><em># Change From:</em><br>[mysqld]<br>datadir=/var/lib/mysql<br>socket=/var/lib/mysql/mysql.sock<br><br><em># Change To:</em><br>[mysqld]<br>datadir=/mnt/mysql-data<br>socket=/mnt/mysql-data/mysql.sock</pre><p>If there is no <strong>client </strong>section, add it; otherwise, update it to:</p><pre>[client]<br>port=3306<br>socket=/mnt/mysql-data/mysql.sock</pre><h3>Step 4: Enable the MySQL Service and confirm the directory change</h3><p>Restart the MySQL service using the following command:</p><pre>service mysqld start</pre><p>Now, use the same command as before to verify the new location of the data directory:</p><pre>mysql -u username -p -e “SELECT @@datadir”</pre><p>If you face any issues during MySQL startup, check the MySQL log file <strong>/var/log/mysqld.log</strong> for errors.</p><p>That’s it. Hope this helps.</p><p><em>If you enjoyed this post, I’d be very grateful if you’d help it spread by emailing it to a friend or sharing it on Twitter or Facebook. Thank you!</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=43b813b48e46" width="1" height="1" alt="">]]></content:encoded>
        </item>
    </channel>
</rss>