YES! Snap
For those of you who keep abreast of current Australian events, the plebiscite on gay marriage has been a hot political topic for quite a while. Voting is currently underway and ballots are due at the ABS by November 7th, which means there’s a couple of weeks to go! So at jtribe, we decided to show our support for marriage equality in our own way: by putting together a fun little app that not only plays with some interesting tech, but also touches on our core values as software developers, namely impact, fairness and sustainability. This is the sort of engineering opportunity we feel is well worth investing in.
The end result, after a week of development, is YES! Snap, available on both the App Store and the Google Play Store. You can take a selfie with a rainbow moustache and share it with friends and family via Facebook, Twitter and so on. There’s also a collection of filters to flick between, so you can show your support for marriage equality with a number of different colours and stickers. Photos can also be saved to the device to keep for later.
It’s a simple app and a simple concept, but it also shows that as software engineers we have the ability to harness our creativity and ingenuity to empower not just ourselves but others, and we can do so in a way that’s incredibly accessible to any technically-minded individual. Both versions went from design to deployment over the span of a few days using remarkable technology that’s freely available to anyone. The apps themselves are free to use, people can do what they like with the photos they take, and we don’t ask for anything in return.
The Tech
Tech-wise, the core functionality of both the Android and iOS apps is driven by Google’s Mobile Vision library. It provides all of the face detection functionality integral to putting moustaches on faces, quite literally in fact: using Mobile Vision, we can readily get the coordinates for the bottom of the nose and the bottom and sides of the mouth. You can get rotation angles too, which are good for keeping the moustache from looking weird and lopsided when you tilt your head, and if you were so inclined you could even use them to transform the moustache’s size and shape so it appears to turn as the user’s head swings left and right.
There were alternatives we’d looked at, of course, such as OpenCV, probably the most well-known computer vision library available. It offers a lot more in the way of customisability and extensibility, and has a tremendous amount of documentation out there; just look at the sheer quantity of books listed on the website. However, its learning curve was a little too steep for the time frame we wanted to get the app written and submitted in, and Mobile Vision represented an easy-to-use alternative, from an Android perspective at least (Android developer that I am), that didn’t require dealing with the quirks of JNI. That’s something I like to avoid if I can, and Mobile Vision has none of it: all of the native code is abstracted away behind Google Play Services, for better or worse, so it’s straight Java all the way down.
With Mobile Vision’s face detection, the pipeline you work with is pretty simple:
- You’ve got some kind of camera source that consumes the camera feed and turns it into Frames.
- The Frames are then piped through to the detector and the appropriate landmarks are identified. In this case we only want the aforementioned four: bottom of the nose, bottom of the mouth and the corners of the mouth. These four give you enough information to draw our moustache, and using the z-rotation lets you tilt the moustache so it doesn’t end up looking comically lopsided if we angle our head. This is done by implementing a Tracker and extracting the new Face’s information in onUpdate (see the sketch after this list).
- By translating the coordinates of the landmarks to a scale suitable for our camera preview surface view, you can place the landmarks according to your camera preview.
- You then use all of this information to do whatever you fancy, like draw a rainbow moustache under someone’s nose.
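To make that concrete, here’s a rough sketch of wiring up the detector and a Tracker. It’s illustrative rather than lifted from the app, and assumes the standard Mobile Vision classes shipped with Google Play Services:

import android.content.Context;
import android.graphics.PointF;
import com.google.android.gms.vision.Detector;
import com.google.android.gms.vision.Tracker;
import com.google.android.gms.vision.face.Face;
import com.google.android.gms.vision.face.FaceDetector;
import com.google.android.gms.vision.face.Landmark;
import com.google.android.gms.vision.face.LargestFaceFocusingProcessor;

public class MoustacheDetectorFactory {

  // Builds a FaceDetector whose Tracker receives landmark updates for the largest face.
  // A CameraSource would then feed Frames into this detector.
  public static FaceDetector buildDetector(Context context) {
    final FaceDetector detector = new FaceDetector.Builder(context)
        .setLandmarkType(FaceDetector.ALL_LANDMARKS)
        .setTrackingEnabled(true)
        .build();

    detector.setProcessor(new LargestFaceFocusingProcessor(detector, new Tracker<Face>() {
      @Override
      public void onUpdate(Detector.Detections<Face> detections, Face face) {
        // The four landmarks we care about for positioning the moustache
        for (Landmark landmark : face.getLandmarks()) {
          int type = landmark.getType();
          if (type == Landmark.NOSE_BASE || type == Landmark.BOTTOM_MOUTH
              || type == Landmark.LEFT_MOUTH || type == Landmark.RIGHT_MOUTH) {
            PointF position = landmark.getPosition(); // frame coordinates, not preview ones
            // translate to preview coordinates and position the moustache here
          }
        }
        // Roll angle (z-rotation), handy for tilting the moustache with the head
        float roll = face.getEulerZ();
      }
    }));

    return detector;
  }
}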
Google’s example projects are pretty good at showing how this all works, and there’s even some old source code for the stock-standard CameraSource floating around that clearly illustrates the process I’ve described above. They’re a nifty reference point for starting out, and that implementation of CameraSource I’ve linked is incredibly useful if you want to write your own.
As it happened, I went with the standard, totally-black-box CameraSource implementation provided by the library. Writing my own hadn’t seemed necessary at the time, but if I were to do things again I’d probably skip the stock implementation and roll my own, because it had some issues that took rather creative thinking to work around in such a short period of time.
Anyway, time for a quick teardown.
How it works
If you install the app on your Android device, you’ll see that it consists of four Activities. This was a result of aiming for speed and simplicity in development; rather than using Fragments or a third-party solution such as Conductor, I just have a series of single-purpose Activities: splash screen, take a photo, apply stickers, share the photo.
This is of course not the most appealing way of structuring an app like this, and there’s a bit more bitmap caching going on in the background than would otherwise be necessary, so that we can pass URIs between the different screens without running into binder transaction size issues. Those issues, mostly in the form of random TransactionTooLargeExceptions, can be a real pain point if you don’t anticipate them before you architect the entry interface to your Activities. It didn’t make the app feel sluggish though, so trading away an imperceptible sliver of performance for ease of development was easy to justify.
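As a rough illustration (the extra key, Activity class and file variable here are mine, not the app’s), handing over a URI instead of a Bitmap keeps the Intent payload tiny:

// Pass a pointer to the cached photo rather than the photo itself; large Bitmaps in
// Intent extras are what trip the binder transaction size limit.
Intent intent = new Intent(this, StickerActivity.class);
intent.putExtra("extra_photo_uri", Uri.fromFile(cachedPhotoFile));
startActivity(intent);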
Splash!
The splash screen is pretty straightforward, except for one thing: Mobile Vision has to fetch some dependencies in the background after the app is installed, and you don’t know how big those files are. This presents two difficulties:
- The user may have insufficient space to store the dependency on their device.
- There are cases where the dependency may not have been fetched already by the time the user launches the app.
As far as workarounds go, the first one is already reasonably well solved: you can calculate the amount of free space on the device, compare it to some threshold, and if it fails that test, present the user with a popup saying that they’ve probably gotta go delete some things. The second one is a bit more interesting.
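A minimal sketch of that first check; the threshold constant and dialog helper are made-up names, not whatever the app actually uses:

// Warn the user if the data directory looks too full for Mobile Vision's download.
File dataDir = getFilesDir();
long freeBytes = dataDir.getUsableSpace();
if (freeBytes < LOW_STORAGE_THRESHOLD_BYTES) { // e.g. a few tens of megabytes
    showLowStorageDialog(); // hypothetical helper that shows the "free up some space" popup
}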
The first thing you need to know is that you can create a FaceDetector instance without hooking it up to anything, and it will still be able to tell you whether face detection features are available. By instantiating a detector and calling detector.isOperational(), you can poll it to see whether we have the dependencies we need.
With this in mind, if you wanted to use vanilla Android classes you could create an AsyncTask that polls every n seconds until detector.isOperational() returns true and terminates upon succeeding, or, if you’re using RxJava like I did, you could set up a little Observable like so:
// Emit whether the detector is operational on every polling tick
Observable.interval(FACE_DETECTION_POLLING_PERIOD, TimeUnit.SECONDS)
    .map(ignored -> detector.isOperational());
and that’ll do the job for you once you subscribe to it. My full implementation is a bit wordier because I do a check before I switchMap to the above Observable, plus a little bit of fluffy stuff afterwards to make sure the user knows what’s going on, but if all you genuinely care about is waiting for face detection to become available, this is absolutely sufficient.
Upon successfully determining that the face detection is available, we launch the camera Activity.
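Put together, the whole splash-screen flow might look roughly like this; CameraActivity and the polling constant are illustrative, and I’m assuming RxJava 1-era operators to match the snippet above:

// Keep checking until the detector says it's ready, then move to the camera screen.
Subscription waitForFaces = Observable.interval(FACE_DETECTION_POLLING_PERIOD, TimeUnit.SECONDS)
    .map(ignored -> detector.isOperational())
    .filter(operational -> operational) // ignore ticks until the dependencies have arrived
    .take(1)                            // the first success is all we need
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(ready -> startActivity(new Intent(this, CameraActivity.class)));
// Remember to unsubscribe in onDestroy() so the timer doesn't outlive the Activity.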
Snap!
This part of the app took up most of the time. The limitations of using the Mobile Vision library without writing my own CameraSource implementation imposed some interesting constraints on what I could do without having to subject myself to the “do I feel bad about using reflection?” mental kabuki dance. Namely, we don’t actually have easy access to the Camera! It resides in a private field inside the CameraSource, and we have only a very limited means of interacting with it: we can take photos and that’s about it. This makes it tricky to get a nice snap of exactly what the user’s seeing in the preview.
So after the preliminary setup consisting of determining how many cameras are available, grabbing the first one and using that to set the flag we use when initialising the CameraSource (as per this lengthy Gist), we have a bit of an issue.
What we want is a full-screen preview, or as close to it as we can get, but we have to supply a preview size and therefore commit to an aspect ratio. I’d tried simply picking the preview size to fit the entirety of the space available, but sometimes that would result in sizes the device didn’t support, plus weird behaviour from trying to best-fit both the preview size and the picture size, which made the placement of the moustache in the resultant bitmaps difficult to handle. Cameras are hard, and given the rather restrictive wrapper Mobile Vision provides, this wasn’t something I really wanted to deal with. So I took an extremely simple approach:
- Make the preview size fit a fixed aspect ratio, such as 4:3. I accomplished this by making it fill an instance of a custom FrameLayout class I wrote that supports aspect ratios (sketched below).
- Make it take up the entire height of the screen and wrap_content for the width. This involves fiddling with some custom view setup, but nothing particularly tricky.
- Plonk that in a FrameLayout it can ‘draw’ outside the bounds of, centre it in case it’s not wide enough to fill the width of the screen, and make the surrounding FrameLayout match_parent for both width and height.
Voilà. Known aspect ratio, full-screen preview, and a wrapper layout we can use to crop the image to what we can see!
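For the curious, that aspect-ratio layout boils down to a few lines of onMeasure. This is a hedged sketch rather than the exact class from the app; the name and the hard-coded 4:3 ratio are illustrative:

import android.content.Context;
import android.util.AttributeSet;
import android.widget.FrameLayout;

// A FrameLayout that fills the height it's given and derives its width from a fixed
// 4:3 (height:width) ratio, so the camera preview always has a known shape.
public class AspectRatioFrameLayout extends FrameLayout {
    private static final float HEIGHT_TO_WIDTH = 4f / 3f;

    public AspectRatioFrameLayout(Context context, AttributeSet attrs) {
        super(context, attrs);
    }

    @Override
    protected void onMeasure(int widthMeasureSpec, int heightMeasureSpec) {
        // Take the full height on offer, then force the width to match the ratio.
        int height = MeasureSpec.getSize(heightMeasureSpec);
        int width = Math.round(height / HEIGHT_TO_WIDTH);
        super.onMeasure(
                MeasureSpec.makeMeasureSpec(width, MeasureSpec.EXACTLY),
                MeasureSpec.makeMeasureSpec(height, MeasureSpec.EXACTLY));
    }
}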
Taking a photo still means quite a bit of work cropping and scaling the bitmap the camera hands us, but because we know the size of the wrapper layout, we just need to scale things so they share the same height and then crop the sides off accordingly.
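Concretely, that scale-and-crop step might look something like this minimal sketch; the method name and parameters are mine, not the app’s:

// Scale the captured photo so its height matches the preview, then chop the sides off
// so the saved image matches what the user saw on screen.
static Bitmap cropToPreview(Bitmap photo, int previewWidth, int previewHeight) {
    float scale = (float) previewHeight / photo.getHeight();
    int scaledWidth = Math.round(photo.getWidth() * scale);
    Bitmap scaled = Bitmap.createScaledBitmap(photo, scaledWidth, previewHeight, true);
    int x = Math.max(0, (scaledWidth - previewWidth) / 2); // centre the horizontal crop
    int croppedWidth = Math.min(previewWidth, scaledWidth);
    return Bitmap.createBitmap(scaled, x, 0, croppedWidth, previewHeight);
}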
We do the very same thing for the moustache overlay too — it actually lives inside the layout that displays the camera preview feed, so they share the same size. All we need to do for that is just crop the sides as appropriate. There were a couple of gotchas I’d noticed, just quick ones:
- EXIF data is hard to get at unless you write the bitmap from the camera to disk first. You need that data to fix rotation issues, of which in Android-land there are many.
- Android front-facing cameras love to mirror the preview, so you must compensate for that.
Solving the first mostly solves the second, as long as you can determine whether you’re currently using the front camera. After that, we just pass a URI through to the next Activity, which is where we choose our filters.
Sticker!
The sticker selection screen is almost offensively simple. We have a ViewPager that contains a bunch of views which are inflated from an array of Filter classes that contain two things — a layout resource ID and a name.
We use the name for tagging the ViewPager items as we inflate them, then use that tag to draw the chosen layout onto the bitmap. It’s a nifty trick, one that I’d recommend keeping in mind in case you ever want to take a snapshot of a ViewGroup.
private void drawFilter(Canvas c) {
    // Find the currently-visible filter view via the tag we set when inflating it
    int item = filterPager.getCurrentItem();
    ViewGroup v = filterPager.findViewWithTag(FILTERS[item].getName());

    // Snapshot the view using its drawing cache and paint it onto the photo's canvas
    v.setDrawingCacheEnabled(true);
    v.buildDrawingCache();
    Bitmap overlay = v.getDrawingCache();
    c.drawBitmap(overlay, 0, 0, new Paint());

    // Clean up so we don't hold onto the cached bitmap
    v.setDrawingCacheEnabled(false);
    v.destroyDrawingCache();
}
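For that tag lookup to work, the pager’s adapter needs to set the tag as it inflates each filter. Roughly, and hedging: getLayoutRes() is my stand-in for however the Filter class exposes its layout resource ID:

// Inside a PagerAdapter: inflate the filter's layout, tag it with the filter's name so
// drawFilter() can find the visible page later, and add it to the pager.
@Override
public Object instantiateItem(ViewGroup container, int position) {
    Filter filter = FILTERS[position];
    View view = LayoutInflater.from(container.getContext())
            .inflate(filter.getLayoutRes(), container, false);
    view.setTag(filter.getName());
    container.addView(view);
    return view;
}

@Override
public boolean isViewFromObject(View view, Object object) {
    return view == object;
}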
The moustache resides in an ImageView that we plonk on top of the ViewPager and that can be turned on or off by the user. If they want the moustache in the final result, we use the drawing cache to fetch it and paint it on top of the selfie and the chosen filter. We save the file to the app’s cache, then pass the URI to the final Activity.
Share!
This is the simplest part of the app, and took me all of two hours to put together. We get a preview of the final product and, if the user lets us, we save it to the Pictures directory in external storage. The only gotcha is sharing the actual image, since newer versions of Android aren’t exactly fond of you sharing files straight out of the app cache. It’s so verboten that it’ll actually result in an exception being thrown on Android N and up, so you should use something like a FileProvider. Some apps that handle the Intent may not behave properly, but honestly? The onus is on them to be up to speed with platform changes when it comes to that stuff. I’d rather they crash than me.
Most of ’em handle it just fine anyway!
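For reference, the sharing side of that comes down to a few lines with FileProvider; the authority string and the imageFile variable here are placeholders, not the app’s real values, and the provider still needs to be declared in the manifest:

// Wrap the saved file in a content:// URI and hand it off via a share sheet.
Uri contentUri = FileProvider.getUriForFile(
        this, "com.example.yessnap.fileprovider", imageFile);
Intent share = new Intent(Intent.ACTION_SEND)
        .setType("image/jpeg")
        .putExtra(Intent.EXTRA_STREAM, contentUri)
        .addFlags(Intent.FLAG_GRANT_READ_URI_PERMISSION);
startActivity(Intent.createChooser(share, "Share your photo"));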
In summary
That’s the app.
As software engineers, we’re creative individuals with the capacity to create meaningful software out of almost nothing. The prerequisites are simple; anyone fortunate enough to own a computer can put together something that’s meaningful or silly or anywhere in between — even both! And why wouldn’t we? Software is challenging and fun, and even bringing small ideas such as YES! Snap to fruition is something to be proud of.
About Us
At jtribe, we proudly create software for iOS, Android and the Web, and we’re passionate about what we do. We’ve been working with the iOS and Android platforms since day one, and are one of the most experienced mobile development teams in Australia. We measure success by the impact we have, and with over six million end users we know our work is meaningful. This continues to be our driving force.