Kittydar: Detecting Edges in the World
From the Series: Digital Ontology
From the Series: Digital Ontology
How do we see ontological transformations? Take the case of kittydar, a small demonstration of machine-learning techniques in the area of computer vision. According to Heather Arthur, the project’s developer, “Kittydar is short for kitty radar. Kittydar processes an image (canvas) and calculates the locations of all the cats in the image.” This playful piece of software demonstrates the detection of edges in the mundane, not to say banal, cosmos of cat photos on the Internet. Given that cats matter in this cosmos, kittydar generates ontological statements. Arthur explains:
Kittydar first chops the image up into many “windows” to test for the presence of a cat head. For each window, kittydar first extracts more tractable data from the image’s data. Namely, it computes the Histogram of Orient Gradients descriptor of the image . . . This data describes the directions of the edges in the image (where the image changes from light to dark and vice versa) and what strength they are. This data is a vector of numbers that is then fed into a neural network which gives a number from 0 to 1 on how likely the histogram data represents a cat.
The neural network (the JSON of which is located in this repo) has been pretrained with thousands of photos of cat heads and their histograms, as well as thousands of noncats. See the repo for the node training scripts.
This toy example of machine learning finds cat heads in photographs, but we can easily locate similar techniques in use in self-driving cars, border security systems, military robots, gesture recognition systems, online advertising, supply chain logistics, as well as truly legion scientific applications. In these places, road signs, immigrants, the enemy, players, customers, and transactions function analogously to cats. Their popularization is striking and perhaps attests to an ontological gradient, a curvature that shifts and reshapes beings. Kittydar relies on a neural network, a typical machine learning technique that has been heavily developed by researchers at Google, themselves working on images of cats, among other things taken from YouTube videos (BBC 2012).
Several vectors orient the propagation of such techniques, whose results can be seen above:
Imitation, edges, and generalization together configure a system of transformations that may well have some ontological significance. The imitative vector follows existing patterns whose proliferation and profusion has created problems of scale: what to do with all the cat images on the Internet? One answer is to build machines that classify them so that they can be ordered, searched, or retrieved from amidst all the other images. The edge and shape-locating classification reconfigure the visible as a field of gradients whose variations can be traversed and marked as differences in the world. The errors attending this cosmological operation (for instance, kittydar identifies some human faces as cats) might amuse us, but they also engender renewed efforts to optimize the classificatory performance. Error is no obstacle, but like delinquency in Michel Foucault’s (1977) penal institutions, presents something absorbingly generative in its synthetic possibilities. Efforts to optimize errors also tend to generalize the techniques themselves. Each time a limit is addressed, each time a more complicated set of edges can be detected (cats side-on, for instance), the algorithm also lends itself to further generalization.
Such devices perhaps also diagram an ontological gradient, a surface whose variable curvature reshapes things—objects, subjects, fields, institutions—moving across it, by reframing the production of categorical statements. The ontological gradient of the probabilistic, whether in the neural net or support vector machine, densely weaves elements drawn from infrastructures, sciences, institutional practices (such as learning and training), and the plural transients of popular culture and media. Amidst very powerful vectors of normalization (the identification of cat faces does not, in itself, suggest any radical origination and may depotentialize alternative futures), what chance is there that different coherences, configurations of beings, and new edges might be articulated and become visible in the world? Could we learn from the difficulties that kittydar has in organizing its cosmos in order to negotiate that ontological gradient? To do so, we would need to imitate what kittydar does, finding new edges in the world and generalizing them in thought.
BBC. 2012. “Google Computer Works Out How to Spot Cats.” June 26.
Foucault, Michel. 1977. Discipline and Punish: The Birth of the Prison. Translated by Alan Sheridan. New York: Vintage.
Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. 1986. “Learning Representations by Back-Propagating Errors.” Nature 323, no. 6088: 533–36.