Monday, July 24, 2006

Porn in the U.S.A

by Motherfucking Shit (636021) on 11:06 Thursday 05 May 2005 (#12438840)

Oh man, if only you'd posted two years ago, I could have likely hooked you up with the best porn organizing system ever developed. You haven't lived until you've built a categorizing system for hundreds of gigs of porn (hundreds of thousands of pics, tens of thousands of movies). 'Course having access to all that porn doesn't hurt either!

At the time I was a partner in a company that envisioned creating the mother of all porn sites. We were dealing with such an amount of content that we had to devise our own way of managing it. The solution I came up with was PHP/MySQL based. There was a database that held ungodly amounts of metadata about every file, and a "sorting" script that allowed some hired grunts to sit around all day literally reviewing thumbnails and sorting each image/movie by quality, assigning categories, etc.

A major HD death (of an unbacked-up HD*...) killed the project, and I've only got some snippets of the code left, and no database schema. As an example, though, here is some of the metadata we collected on every file - this code is not from the final revision, so some fields are missing...

For JPGs:
$result = mysql_query("INSERT INTO content SET name='$_POST[name]', owner=$_POST[content_provider], type='$_POST[type]', size=" . filesize("$_POST[origdir]/$filename") . ", height=$height, width=$width, md5sum='$md5sum', $cats, rawfile='$newpath/$filename', preview1='$preview1', preview2='$preview2', rate_internal=$_POST[quality], description='$_POST[description]', added=" . time(), $db);
For movies:
$result = mysql_query("INSERT INTO content SET moviethumb='$newpath/thumb.jpg', name='$_REQUEST[name]', owner=$_REQUEST[content_provider], type='$_REQUEST[type]', movietype='$_REQUEST[movietype]', moviesize_1=$filesize_1, moviesize_2=$filesize_2, moviesize_3=$filesize_3, movieheight_1=$height_1, movieheight_2=$height_2, movieheight_3=$height_3, moviewidth_1=$width_1, moviewidth_2=$width_2, moviewidth_3=$width_3, moviemd5_1='$md5sum_1', moviemd5_2='$md5sum_2', moviemd5_3='$md5sum_3', $cats, movierawfile_1='$newpath/speed1.wmv', movierawfile_2='$newpath/speed2.wmv', movierawfile_3='$newpath/speed3.wmv', preview1='$newpath/thumb.jpg', preview2='$newpath/thumb2.jpg', rate_internal=$_REQUEST[quality], description='$_REQUEST[description]', movieseconds_1='$movieseconds_1', movieseconds_2='$movieseconds_2', movieseconds_3='$movieseconds_3', added=" . time(), $db);
In both cases, "cats" was a string built earlier in the sorting process, like "cat1=5, cat2=17, cat3=40" ... Each catN corresponded to an entry in the "categories" table. So you could assign one pic/vid to categories "Asian," "Foot Fetish," "Brunette," whatever fit.
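To make the "cats" idea concrete, here is a minimal sketch in Python with SQLite rather than the original PHP/MySQL. The original schema is lost, so the table layout, the three-category limit, and the helper names (make_db, build_cats, insert_content) are all assumptions for illustration. It also passes values as bound parameters instead of pasting request data straight into the SQL string, which is worth doing in any rebuild.

```python
import sqlite3
import time

MAX_CATS = 3  # assumed number of catN columns; the real schema is lost


def make_db():
    """Create an in-memory database with a guessed-at, cut-down schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE content (id INTEGER PRIMARY KEY, name TEXT, owner INTEGER, "
        "md5sum TEXT, cat1 INTEGER, cat2 INTEGER, cat3 INTEGER, "
        "rate_internal INTEGER, description TEXT, added INTEGER)"
    )
    return conn


def build_cats(category_ids):
    """Map a list of category ids onto cat1..catN column/value pairs."""
    if len(category_ids) > MAX_CATS:
        raise ValueError("too many categories for one file")
    return {f"cat{i}": cid for i, cid in enumerate(category_ids, start=1)}


def insert_content(conn, fields, category_ids):
    """Insert one content row; values are bound, never string-interpolated."""
    row = dict(fields)
    row.update(build_cats(category_ids))
    row["added"] = int(time.time())
    # Column names come from this code, not from user input.
    columns = ", ".join(row)
    placeholders = ", ".join("?" for _ in row)
    cur = conn.execute(
        f"INSERT INTO content ({columns}) VALUES ({placeholders})",
        list(row.values()),
    )
    return cur.lastrowid
```

Calling insert_content(conn, {"name": "set 1", "owner": 2}, [5, 17, 40]) would store the row with cat1=5, cat2=17, cat3=40, matching the string described above.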

The movie sorter was really one of my better works ever - and it took forever to build. Unreviewed vids went into an "unsorted" dir. The person reviewing content would choose a vid to review. The script used mplayer to rip a few stills from the vid in realtime, and the person sorting could choose a couple of them to act as thumbnails. Once they assigned categories, description, quality, etc., the script would cut three versions of the vid at different resolutions (low, med, hi). God, I wish I still had all the code :/
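The still-ripping step might look roughly like the sketch below, written in Python rather than the original PHP. This is not the lost code: the function names, the evenly spaced timestamps, and the one-directory-per-still layout are assumptions. The mplayer options used (-nosound, -vo jpeg:outdir=, -frames, -ss) are from the MPlayer man page as I recall it, so check them against your installed version. The code only builds the command lines; running them is left to subprocess.run.

```python
import os


def still_timestamps(duration_seconds, count):
    """Return `count` evenly spaced timestamps, skipping the very start and end."""
    step = duration_seconds / (count + 1)
    return [round(step * i, 2) for i in range(1, count + 1)]


def mplayer_still_command(video_path, out_dir, timestamp):
    """Build an mplayer command that dumps one JPEG frame at `timestamp`."""
    return [
        "mplayer",
        "-nosound",                     # no audio needed for stills
        "-vo", f"jpeg:outdir={out_dir}",  # write frames as JPEG files
        "-frames", "1",                 # stop after a single frame
        "-ss", str(timestamp),          # seek to this offset in seconds
        video_path,
    ]


def still_commands(video_path, work_dir, duration_seconds, count=4):
    """One command per still; each gets its own directory (assumed layout),
    since mplayer picks the output file names itself."""
    commands = []
    for index, ts in enumerate(still_timestamps(duration_seconds, count), start=1):
        out_dir = os.path.join(work_dir, f"still_{index}")
        commands.append(mplayer_still_command(video_path, out_dir, ts))
    return commands
```

Each command could then be run with subprocess.run(cmd, check=True) and the resulting JPEGs shown to the reviewer as thumbnail candidates.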

One thing of vast importance to anyone building any sort of file tracking system - be it for porn, mp3s, whatever - is hashing. We used md5. Content providers would upload or send CDs full of zips or rars, and oftentimes the same set of pics would show up in multiple places. Being able to compare file hashes was essential in preventing duplicate content from going into the database. I guess for MP3s this would be a little harder, since you might have two copies of the same song ripped by different people, and those files won't hash the same...
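A minimal sketch of hash-based duplicate detection, in Python rather than the original PHP. The function names and the idea of passing in a set of already-known hashes are assumptions. In a real system the known hashes would come from something like the md5sum column mentioned earlier.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large movies never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_duplicates(paths, known_hashes):
    """Separate incoming files into new ones and byte-identical duplicates.

    `known_hashes` holds hashes already in the database (assumed input).
    Duplicates inside the same batch are caught as well.
    """
    seen = set(known_hashes)
    new_files, duplicates = [], []
    for path in paths:
        file_hash = md5_of_file(path)
        if file_hash in seen:
            duplicates.append(path)
        else:
            seen.add(file_hash)
            new_files.append((path, file_hash))
    return new_files, duplicates
```

This only catches files that are identical byte for byte. A re-encoded or resized copy of the same picture gets a different hash and slips through, which is the same problem as the MP3 case.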

Maybe someday I'll try to rebuild the DB schema and rewrite the missing portions of the code - which unfortunately is most of it. But if anyone else wants to write a porn organizer, maybe you picked up some pointers from this post...

*A single HD failure can ruin your entire business, by rendering work useless and breaking everyone's spirit to the point where they aren't willing to redo the work. Don't be a dumbass like my partners and I were. Make sure everything is backed up!