Minority Report Interface for Medium

Following my experiments from last week with using macros to hook a single key press on my laptop to open up a “new story” page on Medium, I’ve been taking the research a little deeper.

Okay, a lot deeper.

In Keyboard Maestro, one of the elements you can use to build your macro is mouse movements and actions. So here's a partial recipe, triggered by a keystroke shortcut while Medium.com is open in a browser window:

It simulates mouse movement, clicks the "Sign In/Sign Up" button, waits half a second, and then mouses down to "Sign In/Sign Up By Email" and clicks…

From there (not depicted) it pastes in a pre-set email address, hits “Enter” and then logs into an email program, mouses to the habitual location of the first email in the list, clicks that title to open the email, and then clicks on the sign in link in the email.

Functional (if ugly) Medium login with one key press.
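For the curious, here's roughly what that click choreography looks like scripted by hand instead of in Keyboard Maestro. A minimal sketch using Python's pyautogui library, where the screen coordinates and email address are placeholders you'd swap for your own:

```python
import time

import pyautogui  # pip install pyautogui

# Placeholder screen coordinates -- adjust to wherever the buttons
# actually sit in your browser window.
SIGN_IN_BUTTON = (1180, 90)
SIGN_IN_BY_EMAIL = (640, 420)
EMAIL_ADDRESS = "you@example.com"  # your pre-set address

def medium_login():
    # Simulate mouse movement to "Sign In/Sign Up" and click it
    pyautogui.moveTo(*SIGN_IN_BUTTON, duration=0.3)
    pyautogui.click()
    time.sleep(0.5)  # wait half a second for the dialog to appear

    # Mouse down to "Sign In/Sign Up By Email" and click
    pyautogui.moveTo(*SIGN_IN_BY_EMAIL, duration=0.3)
    pyautogui.click()

    # Type the pre-set email address and hit Enter; from here you'd
    # switch to the mail client and click the sign-in link
    pyautogui.write(EMAIL_ADDRESS, interval=0.02)
    pyautogui.press("enter")

if __name__ == "__main__":
    medium_login()
```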

Prior to that more involved action, I also tested out a simpler macro, which works from the New Story page to click through the publish dialog. So that's:

  1. One key to login
  2. One key to start a new story (you could combine this with step 1, in fact)
  3. One key to publish (if you’re willing to accept the suggested tags)

Right now, I’m passing the command signals to trigger these actions using a little mini-piano MIDI keyboard by Korg:

But you can just as easily re-assign your existing keyboard buttons to act as triggers, or plug in and set up a cheapo USB number pad to have the same effect:
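If you're curious how little glue the MIDI side actually needs, here's a minimal sketch using Python's mido library. The note numbers and macro names are my own made-up examples; the one real hook is that the Keyboard Maestro engine lets you run macros by name via AppleScript:

```python
import subprocess

import mido  # pip install mido python-rtmidi

# Hypothetical mapping of three notes on the little Korg
# (middle C, D, E) to Keyboard Maestro macro names.
NOTE_TO_MACRO = {
    60: "Medium - Log In",
    62: "Medium - New Story",
    64: "Medium - Publish",
}

def run_macro(name):
    # Keyboard Maestro Engine exposes macros to AppleScript by name.
    script = f'tell application "Keyboard Maestro Engine" to do script "{name}"'
    subprocess.run(["osascript", "-e", script])

# Open the default MIDI input (pass a port name to pick a specific device)
with mido.open_input() as port:
    for msg in port:
        if msg.type == "note_on" and msg.velocity > 0:
            macro = NOTE_TO_MACRO.get(msg.note)
            if macro:
                run_macro(macro)
```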

Wondering how to take this joke even further, I started looking for other MIDI-enabled interface devices that might be pressed into service for surfing the web and copy/pasting like a pro.

This one seems very “Ocarina of Time” to me:

Like, conceivably you could set up a string of notes to open up a particular application or URL and perform some specific “magical” action on your computer.
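A sketch of how that might work, again with mido: buffer the last few notes played and fire the "spell" when they match a preset sequence. The song and the URL here are assumptions, obviously:

```python
import webbrowser

import mido  # pip install mido python-rtmidi

# A made-up five-note "song" that, played in order, opens a new story.
SONG = [60, 64, 62, 60, 67]
SPELL_URL = "https://medium.com/new-story"

recent = []
with mido.open_input() as port:
    for msg in port:
        if msg.type == "note_on" and msg.velocity > 0:
            recent.append(msg.note)
            recent = recent[-len(SONG):]  # keep only the last few notes
            if recent == SONG:
                webbrowser.open(SPELL_URL)
                recent.clear()
```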

For whatever reason, the vast majority of hackers and experimenters I've found in the world of custom MIDI controllers are musicians embracing electronic components. Imogen Heap had those famous gloves which appeared in a TED talk:

And Remidi is basically trying to Kickstarter exactly the same thing: a "wearable" glove whose movements trigger MIDI signals. So conceivably, linking this back to macros, you could make a fist to sign into Medium (MIDIum?), open your palm facing downward to launch a new story, start typing, and then flip your palm up when you're ready to publish.

Or whatever chain of gestures you want.

Call me crazy, but I’d rather buy a product with a proven track record than risk somebody failing to deliver on a crowdfunded experiment. So Remidi is out, as far as I’m concerned.

Leap Motion is another controller working in this space that uses infrared to track hand movements within a given slice of space above the sensor:

On the one hand, ahem, this is cool because you don't need to wear (and wear out) some strange glove, and the sensor is relatively cheap (I've seen it for $70 CAD). Downsides include, as in the photo above, that you have to keep your hands up above the sensor, which Amazon reviewers say leads to fatigue, or what the Gesture Recognition page on Wikipedia calls "gorilla arm."

There are at least two applications for the Leap which allow you to program hand movements (no body gesturing available) to control MIDI triggers: AeroMIDI and Geco:

But there happens to be some weird “App Store” and launcher business that you get roped into when you buy a Leap — which I find utterly abhorrent and boring and out of date.

If you could (and you probably can) hack the Leap to just get the data out of it without all the app store business, this may actually be an interesting product, except that it points up rather than forward (as Microsoft's Kinect does) and it only tracks hands.

Of course, how you are tracked by these devices matters a lot. For that reason, I really like the looks of DrumPants: basically a set of wearable MIDI triggers intended to make your leg drumming into something more serious:

Okay, so they don’t really look all that serious in the video. But still, their form factor is cool because it’s not a camera which can also be hacked to spy on you (not paranoid, that’s just a fact). It’s just a flexible set of triggers you can adapt to your body however you want, which then send a signal over Bluetooth.

The Bluetooth part I’m a little skeptical of, only because I have some Bluetooth speakers I bought for my computer which exhibit significant lag. Presumably MIDI note on/off signals are much smaller than sending actual audio, but I’d definitely have to see it to believe it.
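There's no way to measure true end-to-end latency without a reference clock, but a crude way to get a feel for it is to tap a steady beat on the device and watch the inter-arrival times of the note messages, something like this mido sketch:

```python
import time

import mido  # pip install mido python-rtmidi

# Tap a steady beat on the Bluetooth device; wildly uneven gaps
# between messages are a bad sign for using it as a trigger.
with mido.open_input() as port:
    last = None
    for msg in port:
        if msg.type in ("note_on", "note_off"):
            now = time.monotonic()
            if last is not None:
                print(f"{msg.type:8s} note={msg.note:3d} +{(now - last) * 1000:6.1f} ms")
            last = now
```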

In that same direction, Machina markets a MIDI jacket, presumably for "DJs" who want to look really, really "cool":

I could actually see that being kind of awesome (imagine using it to trigger a bank of lasers to fire from your body, as a "use case"), but seeing as what I want is a gestural MIDI controller I can use for working, not for "kickin' it," I can't see wanting to constantly wear this bulky jacket to check my email, drop a link into Slack, close a browser tab, etc.

Going back to accessing your camera to (potentially) spy on you, there are also a couple of apps which use your laptop camera (no additional equipment purchase necessary) to recognize a limited set of hand gestures. See Flutter (bought by Google) and ControlAir:

Personally, I keep my laptop camera covered with a small square of black electrical tape, because I don’t want a camera in my face and in my house all day and all night.

But there is probably a way to access the control signals captured by either Flutter or ControlAir (I haven't checked) and pass them to MIDI, and hence to macros. From what I gather, though, they can only recognize a handful of commands.
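If you could get at those signals, the bridge itself would be trivial: publish each recognized gesture as a note on a virtual MIDI port that your macro software listens to. A purely hypothetical sketch (neither app documents such a hook, as far as I know):

```python
import mido  # pip install mido python-rtmidi

# A virtual MIDI output other apps can subscribe to (works with the
# rtmidi backend on macOS/Linux). The gesture names are invented here.
GESTURE_TO_NOTE = {"open_palm": 60, "fist": 62, "thumbs_up": 64}

out = mido.open_output("Gesture Bridge", virtual=True)

def on_gesture(name):
    # Call this from whatever hands you the recognized gesture
    note = GESTURE_TO_NOTE.get(name)
    if note is not None:
        out.send(mido.Message("note_on", note=note, velocity=100))
        out.send(mido.Message("note_off", note=note))
```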

Which is weird to me, because I thought this field would have developed more over the last few years. Instead it seems to be the domain of a handful of companies selling technology they don't know what to do with to people who don't really know what to do with it either.

Anyway, I know what to do with it.

Approximately.

At the moment, I’m actually the most interested in the Kinect, because it frees you up to track full-body positions, not just your hands. Yes, it uses a camera (plus an infrared camera), but I’m more okay with a camera when it’s a device I can unplug and know for sure when it is Off.

And there is already a fairly developed Kinect hacker community, especially around Skanect, which enables you to do rough-but-cool 3D scanning output as a point cloud:

Over and above being able to Minority Report my Medium interface, I'm kind of curious about trying to apply this 3D-scanning technology in the garden, to make moment-to-moment biome captures throughout the season.

Anyway, that’s another ball of wax entirely, but it makes the Kinect look a bit more interesting than the Leap which (or so I’ve read) can’t be pressed into service that way — though they have opened their API so you can access raw image data.

For the Mac, it appears you have to be careful about which Kinect model number you buy. Model 1414, visible on the base, is supposed to be the good one. It's generally the older model, which I think (but can't verify yet) is typically the Xbox 360 sensor version.

Unlike the Leap, the Kinect works by something called skeleton binding:

(Coincidentally, also the image a Terminator robot sees before killing you)

I’m simultaneously fascinated and a little creeped out by this. Based on my initial research it may also be potentially complicated to get this to work on the software side with Macs. I’ve found a bunch of different unfamiliar buzzwords attached to the subject of getting MIDI signals (or OSC — Open Show Control) out of a Kinect. I’ve yet to sort it all out as what is the proper path:

A bunch of research links:

See also: OpenNI, Processing, Max, and OSCeleton.
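To make that slightly more concrete: as I understand it, OSCeleton streams the Kinect's skeleton joints as OSC messages over UDP (to port 7110 by default, I believe). The receiving end could look something like this python-osc sketch; the message layout and the gesture threshold are my assumptions, not gospel:

```python
from pythonosc import dispatcher, osc_server  # pip install python-osc

def on_joint(address, joint, user, x, y, z):
    # OSCeleton (as I understand it) sends: joint name, user id, x, y, z.
    # The threshold below is a guess -- calibrate against real output.
    if joint == "r_hand" and y < 0.2:
        print(f"user {user}: right hand raised at ({x:.2f}, {y:.2f}, {z:.2f})")
        # ...here you'd send a MIDI note or run a macro, as above

d = dispatcher.Dispatcher()
d.map("/joint", on_joint)
server = osc_server.BlockingOSCUDPServer(("127.0.0.1", 7110), d)
server.serve_forever()
```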

Still surprises me how much in its infancy gesture recognition really is — at least for the Mac (I think some Kinect stuff works natively on Windows PC). Kinect was launched at the end of 2010, which in technology-years is like a biiiiiillion.

See also MotionSavvy, apparently an offshoot of Leap:

I’d say that maybe they have a ways to go on branding though:

“Killed by robots with machine vision? No way!”

Anyway, I’ll just leave you this last video as something to practice at your “standing desk” while you use macros keyed to physical gesture triggers to read, recommend, highlight and type into Medium using whatever set of signals you can manage to pass in:

All that without even touching the API!