- Google "generative art". You'll find lots of images and you can usually work out roughly how they were made. Try to reverse engineer them and create your own facsimile.
Start with something that looks a bit like something you might want to make into art. One of the standard fractals, Perlin noise with a fixed seed, anything that's got a bit of randomness to it. Look for a section that's inspiring. Then look to see if you can tweak the logic to make that closer to what you want it to be.
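To make the "fixed seed" point concrete, here is a minimal sketch. It uses plain value noise (random knots joined by smoothstep interpolation), not real Perlin noise, and the names `value_noise_1d` and `render` are made up for this example. The fixed seed means you get the same curve on every run, so you can find a section you like and then tweak the logic around it.

```python
import random


def value_noise_1d(n_points, n_knots=8, seed=42):
    """Smooth 1D value noise: random knots joined by smoothstep easing.

    This is a simplified stand-in for Perlin noise, not Perlin noise itself.
    """
    rng = random.Random(seed)  # fixed seed -> identical output on every run
    knots = [rng.random() for _ in range(n_knots + 1)]
    out = []
    for i in range(n_points):
        x = i / n_points * n_knots  # position in knot space, always < n_knots
        k = int(x)                  # index of the knot to the left
        t = x - k                   # fractional distance to the next knot
        t = t * t * (3 - 2 * t)     # smoothstep easing
        out.append(knots[k] * (1 - t) + knots[k + 1] * t)
    return out


def render(values, height=10):
    """Draw the curve as rows of text, tallest level first."""
    rows = []
    for level in range(height, 0, -1):
        rows.append("".join("#" if v * height >= level else " " for v in values))
    return "\n".join(rows)


if __name__ == "__main__":
    print(render(value_noise_1d(64)))
```

Change `seed` to explore, then keep the seed that gives you an interesting shape and start editing the interpolation or the rendering.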
I've been trying for weeks now to get a system running that can handle larger-than-RAM datasets and return queries in an acceptable time. It's running OK now but is far from optimal (the DB is ~100 GB and contains a few hundred million entries).
Does anyone here have experience with any implementations (such as likelike, lshkit, etc.) and can recommend something that can handle larger sets? All the implementations I have found were either not maintained, old, not running or not suitable for production use.
Will definitely take a look at the paper but unfortunately it's always a very long way from here to an actual implementation (there is no code published as far as I could see).
I've been playing with an implementation on top of lightning mdb[1]. Your profile doesn't have an email but feel free to email me if you're interested.
100GB isn't that big a deal. If you have at least 16GB of RAM it should be a breeze. There are much larger data sets in OpenLDAP in production around the world.
But I wouldn't choose Python for large scale data processing work. The Python CPU/memory overhead is like 100:1, compared to C. (This is why I worked on rtorrent and ditched the original bittorrent client ASAP, and why I hate bitbake....)
The biggest problem currently is actually degrading performance, although I'm almost 100% sure that this isn't caused by lmdb itself, but rather by the bindings I've tried.
In the end, doing it directly in C is probably the only thing that will actually work.
We are currently using perceptual hashes (e.g. phash.org) to do hundreds of thousands of image comparisons per day.
As mentioned in another comment, you really have to test different hashing algorithms to find one that suits your needs best. In general though, I think it is in most cases not necessary to develop an algorithm from scratch :)
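For anyone who hasn't seen how such a comparison works, here is a rough sketch. It is a simple difference hash (dHash), not the algorithm pHash actually uses, and it assumes the image has already been shrunk to a small grayscale grid (8 rows by 9 columns gives a 64-bit hash). The function names are made up for this example.

```python
def dhash(pixels):
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `pixels` is a list of rows of grayscale values (assumed already
    downscaled, e.g. 8 rows x 9 columns for a 64-bit hash).
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Synthetic 8x9 gradient standing in for a downscaled image.
    image = [[x * 7 + y * 3 for x in range(9)] for y in range(8)]
    brighter = [[v + 10 for v in row] for row in image]
    mirrored = [row[::-1] for row in image]
    print(hamming(dhash(image), dhash(brighter)))  # uniform brightening
    print(hamming(dhash(image), dhash(mirrored)))  # horizontal flip
```

Two images count as "similar" when the Hamming distance is below a threshold you pick by testing on your own data, which is exactly why trying several hashing algorithms is worth the effort.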
Indeed, the storage and retrieval of similar images is the hardest part. I do not know of a single networked open-source storage solution for this. I really wish that there was a project with a mindset of Redis, but for MVP trees.
By the way, might it be possible to implement an MVP tree data structure in Redis, as the project is now? I cannot think of any replication issues with this, apart from the fact that one would have to pre-define a metric space for every tree.
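To illustrate the "pre-define a metric space for every tree" point, here is a minimal in-memory sketch. It is a plain vantage-point tree with a single vantage point per node, not a full MVP tree, and it has nothing to do with Redis internals; `VPTree`, `within` and `hamming` are names invented for this example. The metric is passed in once at build time because the tree layout depends on it.

```python
import random


def hamming(a, b):
    """Hamming distance between two integer hashes."""
    return bin(a ^ b).count("1")


class VPTree:
    """Vantage-point tree over `items` under one fixed metric `dist`."""

    def __init__(self, items, dist, rng=None):
        self.dist = dist  # the metric is baked into the tree layout
        self.root = self._build(list(items), rng or random.Random(0))

    def _build(self, items, rng):
        if not items:
            return None
        vp = items.pop(rng.randrange(len(items)))  # pick a vantage point
        if not items:
            return (vp, 0, None, None)
        dists = sorted(self.dist(vp, it) for it in items)
        mu = dists[len(dists) // 2]  # median distance splits the rest
        inner = [it for it in items if self.dist(vp, it) < mu]
        outer = [it for it in items if self.dist(vp, it) >= mu]
        return (vp, mu, self._build(inner, rng), self._build(outer, rng))

    def within(self, query, radius):
        """Return every item whose distance to `query` is <= radius."""
        found, stack = [], [self.root]
        while stack:
            node = stack.pop()
            if node is None:
                continue
            vp, mu, inner, outer = node
            d = self.dist(query, vp)
            if d <= radius:
                found.append(vp)
            if d - radius < mu:   # query ball can overlap the inner side
                stack.append(inner)
            if d + radius >= mu:  # query ball can overlap the outer side
                stack.append(outer)
        return found


if __name__ == "__main__":
    rng = random.Random(1)
    hashes = [rng.getrandbits(16) for _ in range(200)]
    tree = VPTree(hashes, hamming)
    print(tree.within(hashes[0], 3))
```

The pruning in `within` relies on the triangle inequality, so whatever you plug in as `dist` has to be a true metric, and a tree built with one metric cannot be queried with another.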
The thing is though, you won't have difficulties finding papers on those topics. However, you will probably not have any luck finding many concrete and practical implementations that you could look at.
So it's a long way from reading the papers to having something working.
There's a list of CBIR systems on Wikipedia, and among those there are a few open-source ones. I didn't really have time to check them all, but while skimming through them imgSeek [2] caught my eye.
The really interesting part is actually to recognise how hard it is for a new app to enter my daily-use list. It's almost impossible. Some make it in there for a few days or weeks but will vanish quite soon.
Either I need the app for my daily work or it is a fire-and-forget service that I once signed up for and that doesn't require any active input from my side.
getmetricmail.com creates simple Google Analytics reports and sends them to you as a PDF.
Currently 3000 free users. A handful pay, so it makes about $100 per month. A good example of Freemium gone wrong.
I've signed up to check it out. As Alex said, it's a good idea, nice design, etc.
Maybe the feature exists and I didn't spot it, but if you white labelled it so that web designers could send out branded messages to their clients as a value-add, I think you could see paid accounts pick up a bit.
Edit: OK, first email is in. Seems that I have to click a link or get an attachment - this is why I already avoid the Analytics mailed reports. Any chance you can just send the data in the email or does the API not permit it for some reason? I probably wouldn't use this going forward if I had to click through to something or open a PDF, I know that's sulky but just how it is.
You can receive pdf attachments directly, so you don't have to click on the link. Nevertheless that's not what you're looking for, I guess ;)
We thought about putting the data directly into an email, but the crappy HTML/CSS support in the gazillion email clients makes this a pretty tough job.
Yeah, I think Analytics does PDFs from memory and I cancelled all of those.
Can appreciate the frustration with HTML/CSS support - maybe if you kept your layout really simple and/or called it an Old School theme. Mostly I'd be looking for anything that quickly showed me if there was something wrong with a site (or right, e.g., major incoming link).
Wonder if you can throw in some marketing factoids like "Fourth straight month with an increase in traffic" or "Traffic growth continues; fourth straight month" - the sorts of things a marketing guy can repeat to the boss without any more time or research.
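Factoids like that are cheap to compute once you have the monthly totals. A rough sketch, assuming you already have a list of monthly visit counts, oldest first (the function names and the sample numbers are made up):

```python
ORDINALS = {2: "Second", 3: "Third", 4: "Fourth", 5: "Fifth", 6: "Sixth"}


def growth_streak(monthly_visits):
    """Count consecutive month-over-month increases ending at the latest month."""
    streak = 0
    pairs = zip(reversed(monthly_visits[:-1]), reversed(monthly_visits[1:]))
    for previous, current in pairs:
        if current > previous:
            streak += 1
        else:
            break
    return streak


def headline(monthly_visits):
    """Return a one-line factoid, or None if there is nothing to brag about."""
    streak = growth_streak(monthly_visits)
    if streak < 2:
        return None
    if streak in ORDINALS:
        return f"{ORDINALS[streak]} straight month with an increase in traffic"
    return f"{streak} straight months of traffic growth"


if __name__ == "__main__":
    print(headline([100, 90, 95, 110, 120, 150]))
```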
I haven't used your product so I may be saying nonsense, but why don't you also embed an image created from the PDF? HTML support may be crappy, but as far as I know most of them support images.
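For what it's worth, embedding an inline image is straightforward with Python's standard `email` package. A minimal sketch, assuming the PDF page has already been rendered to PNG bytes somewhere else; `build_report_email`, the addresses and the `example.com` domain are placeholders:

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_report_email(png_bytes, summary_text, sender, recipient):
    """Build a message with a plain-text part and an HTML part that
    shows the report image inline instead of as a link or attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Your weekly analytics report"
    msg["From"] = sender
    msg["To"] = recipient

    # Plain-text fallback for clients that do not render HTML.
    msg.set_content(summary_text)

    # The Content-ID ties the <img> tag to the embedded image part.
    cid = make_msgid(domain="example.com")  # placeholder domain
    html = (
        "<html><body>"
        f"<p>{summary_text}</p>"
        f'<img src="cid:{cid[1:-1]}" alt="Analytics report">'
        "</body></html>"
    )
    msg.add_alternative(html, subtype="html")

    # Attach the image as a related part of the HTML alternative.
    msg.get_payload()[1].add_related(png_bytes, "image", "png", cid=cid)
    return msg


if __name__ == "__main__":
    fake_png = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a real image
    message = build_report_email(
        fake_png, "Visits are up this week.", "reports@example.com", "you@example.com"
    )
    print(message.get_content_type())
```

Some clients block inline images by default, so the plain-text part should still carry the key numbers.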
It's a great idea. Don't give up yet. Do you allow people to add multiple email addresses that the reports would be sent to? I imagine that would be very useful for businesses. For example, everyone in the marketing dept could receive a report each week.
The case here shows what happens when you offer too many features/resources in the free plan.
The toughest part now is deciding if it makes sense to invest more time into it. But I guess that is a general problem for startups that haven't yet found product/market fit. You can't really know if you are miles or just an inch away from that fit.
I guess I'm missing something, but there is a setting in Google Analytics to send a dashboard to an email address every week / month in PDF format. What's the difference here?