My 100 Days: First side project

Houssem Masri
4 min read · Jun 20, 2020

Search by Video is my first side project, and part of my 100 days journey towards getting a job. Check out the live version and the source code.

screenshot of the project

The idea:

Search by Video is a search engine that takes a video and returns results. I got the idea when I was looking for something similar, but all I found were Reddit replies suggesting taking screenshots and running them through Google’s reverse image search. For me, that was neither practical nor efficient, so I decided to build my own automated, practical and efficient version.

The stack:

The MERN stack, with Heroku for deployment.

What’s happening behind the scenes?

1- Uploading the video to the server:

You’ve got two options:

1- Paste a direct link to the video and press the search button. The link is sent to the server in a POST request, and the server downloads the video from that link using node-downloader-helper.

2- Upload the video from your own device: press “choose file” and select the video, which is then sent as FormData. On the server side, I used express-fileupload.

The server then checks that the uploaded file really is a video and that it is no larger than 200 MB. If either check fails, the file gets deleted.
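
The two checks can be sketched as a small function. This is only an illustration, not the project's actual code: `isAcceptableUpload` and `MAX_VIDEO_BYTES` are made-up names, and the sketch assumes the file object exposes `mimetype` and `size` (in bytes), as the file objects from express-fileupload do.

```javascript
// Hypothetical helper: the 200 MB limit comes from the text above,
// the names and the exact rule are assumptions.
const MAX_VIDEO_BYTES = 200 * 1024 * 1024;

function isAcceptableUpload(file) {
  // Accept only MIME types such as video/mp4 or video/webm
  const isVideo = typeof file.mimetype === 'string' && file.mimetype.startsWith('video/');
  // Reject anything larger than the limit
  const isSmallEnough = Number.isFinite(file.size) && file.size <= MAX_VIDEO_BYTES;
  return isVideo && isSmallEnough;
}

console.log(isAcceptableUpload({ mimetype: 'video/mp4', size: 5000000 })); // true
console.log(isAcceptableUpload({ mimetype: 'image/png', size: 5000000 })); // false
```

The MIME type is reported by the client, so this is a first filter rather than proof that the file is a playable video.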

2- Generating the screenshots:

For this I used ffmpeg (through the ffmpeg buildpack on Heroku) and the fluent-ffmpeg module for Node.

const ffmpeg = require('fluent-ffmpeg');

// Generate `count` screenshots from an uploaded video.
const screenShotGen = async (filename, count = 13) => {
  const videoPath = `${__dirname}/../client/public/uploads/${filename}`;
  // Probe the file first so an unreadable video is reported early
  ffmpeg.ffprobe(videoPath, (err) => {
    if (err) return console.error(err);
    console.log('Will generate screenshots');
    ffmpeg(videoPath).takeScreenshots({
      filename: '%i',   // %i is replaced by the index of the screenshot
      count: count,     // number of screenshots (not seconds)
      size: '960x540'
    }, './thumbnails');
  });
};

Here, count is the number of screenshots to take: the higher the count, the more accurate the results. Size is the size of each screenshot. It is set to 960x540 as a trade-off: smaller images are faster to process, while larger images give better search results.
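
To give a feel for what count does: according to the fluent-ffmpeg documentation, the screenshots are taken at regular intervals, so requesting 3 of them gives shots at 25%, 50% and 75% of the video length. The sketch below reproduces that spacing. `screenshotPositions` is a hypothetical helper for illustration only, not part of the library.

```javascript
// Positions (in percent of the video length) for `count` evenly spaced shots
function screenshotPositions(count) {
  const interval = 100 / (count + 1);
  return Array.from({ length: count }, (_, i) => interval * (i + 1));
}

console.log(screenshotPositions(3)); // [ 25, 50, 75 ]
```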

3- Uploading the screenshots:

Next, the screenshots are uploaded to the Imgur API using axios, FormData and Bluebird’s asynchronous map function. The link in each response is added to a links array, which is then returned.

To keep things clean, the uploaded video is deleted once the screenshots are generated, and the screenshots are deleted once they are uploaded, both using fs.unlink().
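
Here is a rough sketch of the "upload, then clean up" flow. It is not the project's code: it uses the native `Promise.allSettled` instead of Bluebird's map, `uploadThenDelete` is a made-up name, and `uploadOne` is a placeholder for the real upload call (for example an axios POST to the Imgur API) that resolves to the image link.

```javascript
const fs = require('fs/promises');

// `uploadOne(path)` is an assumed callback that uploads one file
// and resolves to its public link.
async function uploadThenDelete(paths, uploadOne) {
  // Start every upload and wait until each one has finished or failed
  const outcomes = await Promise.allSettled(paths.map((p) => uploadOne(p)));
  // No upload is reading the files any more, so they can be removed
  await Promise.allSettled(paths.map((p) => fs.unlink(p)));
  // Report the first failure, if there was one
  const failed = outcomes.find((o) => o.status === 'rejected');
  if (failed) throw failed.reason;
  return outcomes.map((o) => o.value);
}
```

Waiting for all uploads before deleting means a failed upload never removes a file that another upload is still reading.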

4- Scraping Google’s search results:

It works like this: if you put the link of an image after “https://images.google.com/searchbyimage?image_url=”, Google reverse searches it. I mapped over the list of Imgur image links with Bluebird’s asynchronous map function, fetched each results page as HTML with axios, and then used cheerio to extract the links from the search results.

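// Request the reverse-search results page for one Imgur link.
// encodeURIComponent also escapes ?, & and = inside the image link.
const response = await axios.get(
  "https://images.google.com/searchbyimage?image_url=" + encodeURIComponent(urls[i]) +
  "&encoded_image=&image_content=&filename=&hl=en-US",
  // Send a browser-like User-Agent with the request
  { headers: { 'User-Agent': 'Mozilla/5.0 (Windows NT 6.0; rv:20.0) Gecko/20100101 Firefox/20.0' } }
);
const $ = cheerio.load(response.data);
// Collect the first link of every div.r element on the page
$('div.r').each((_, element) => {
  const link = $(element).find('a').attr('href');
  if (link) results.push(link);
});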
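
Building the request URL is the part that is easiest to get subtly wrong, so here is a small standalone sketch of it. `reverseSearchUrl` is a made-up helper name and the image links are invented examples. It uses `encodeURIComponent`, which (unlike `encodeURI`) also escapes `?`, `&` and `=`, so an image link that carries its own query string stays inside the `image_url` parameter.

```javascript
// Hypothetical helper: build the reverse-search URL for one image link
function reverseSearchUrl(imageUrl) {
  return 'https://images.google.com/searchbyimage?image_url=' +
    encodeURIComponent(imageUrl) + '&hl=en-US';
}

console.log(reverseSearchUrl('https://i.imgur.com/abc123.png'));
// https://images.google.com/searchbyimage?image_url=https%3A%2F%2Fi.imgur.com%2Fabc123.png&hl=en-US
```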

5- Sorting the links:

The links are sorted by how often they occur: the ones that repeat the most go to the top and the ones that repeat the least go to the bottom. Finally, the results (a list of links) are sent back to the front end.

// Count how many times each link appears in the results
const countof = {};
results.forEach((item) => {
  countof[item] = (countof[item] || 0) + 1;
});
// Turn the counts into { link, count } pairs
const countResults = Object.keys(countof).map((link) => ({ link, count: countof[link] }));
// Most frequent links first
countResults.sort((a, b) => b.count - a.count);
const finalResults = countResults.map((entry) => entry.link);
console.log(finalResults);
res.send(finalResults);
} // closes the enclosing function (its opening is not shown here)
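
To see the ranking on concrete data, here is a self-contained variant of the same idea using a Map. `rankByFrequency` is a name made up for this example and the links are invented.

```javascript
// Return the distinct links, most frequent first
function rankByFrequency(links) {
  const counts = new Map();
  for (const link of links) {
    counts.set(link, (counts.get(link) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1]) // descending by count
    .map(([link]) => link);
}

console.log(rankByFrequency(['a.com', 'b.com', 'a.com', 'c.com', 'a.com', 'b.com']));
// [ 'a.com', 'b.com', 'c.com' ]
```

A link that shows up for several screenshots is more likely to be the real source of the video, which is why it belongs at the top.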

6- Showing the results:

And now it’s time to show the results. For this I used react-tiny-link, a component that renders links as embedded previews.

7- Done!

Issues:

1- Heroku:

Since I’m on Heroku’s free plan, I ran into some problems and limits:

  • Performance: it wasn’t what I wanted, and it could have been much better on a paid plan.
  • Heroku request timeout: this was a nightmare for me, since my app takes a long time to process a video. I solved it, but not as cleanly as I wanted.
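
For context, Heroku's router closes a request that has not started responding within 30 seconds. One common way around that kind of limit (not necessarily what this project does) is to answer right away with a job id and let the client poll for the result. A minimal in-memory sketch, where `startJob`, `getJob` and the `jobs` map are all hypothetical names:

```javascript
const crypto = require('crypto');

const jobs = new Map(); // job id -> { status, result } or { status, error }

// Start the slow work without waiting for it and return an id at once
function startJob(work) {
  const id = crypto.randomBytes(8).toString('hex');
  jobs.set(id, { status: 'pending' });
  Promise.resolve()
    .then(work)
    .then((result) => jobs.set(id, { status: 'done', result }))
    .catch((err) => jobs.set(id, { status: 'failed', error: String(err) }));
  return id;
}

// Look up the current state of a job (what a polling endpoint would return)
function getJob(id) {
  return jobs.get(id) || { status: 'unknown' };
}
```

In an Express app, the upload route would respond with the id from `startJob`, and a second route would return `getJob(id)` until the status is no longer pending. Since the map lives in memory, the jobs are lost whenever the server restarts.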

2- External modules and asynchronous code:

My web app has to run as an asynchronous process. Unfortunately, some of the external modules aren’t asynchronous. I solved this as well, by wrapping setTimeout in a promise, but it just didn’t seem clean to me.
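
A possible alternative to a timer, sketched here as an assumption rather than as what the project does: when a module reports completion through events (fluent-ffmpeg commands, for example, emit `end` and `error`), those events can be wrapped in a promise and awaited. `whenFinished` is a made-up helper name, and the emitter below is only a stand-in for a real task.

```javascript
const { EventEmitter } = require('events');

// Resolve when the emitter fires 'end', reject when it fires 'error'
function whenFinished(emitter) {
  return new Promise((resolve, reject) => {
    emitter.once('end', resolve);
    emitter.once('error', reject);
  });
}

// Stand-in for a real event-based task: it finishes after 10 ms
const task = new EventEmitter();
setTimeout(() => task.emit('end'), 10);

whenFinished(task).then(() => console.log('task finished'));
```

The promise settles exactly when the work is done, so there is no delay to guess.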

The Lesson:

Photo by Fab Lentz on Unsplash

I faced a lot of struggles and obstacles while building this project, but that’s what taught me the joy of hard work. This experience taught me that whenever you run into an issue, all you need is some faith in yourself and your abilities, the patience to hold on when you start telling yourself that you are done and should leave it behind, and the willingness to look at the problem from another side.

If you’ve reached this point, help me out with a follow on Twitter.
