Amazing, I completely missed the shill tracker.
Were you tracking MD5 checksums? Even one changed pixel would generate a different hash.
So MD5 can be used as a first-stage filter for exact duplicates, but I think a CNN-based image classifier is a must.
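To illustrate the MD5 point, here is a minimal sketch of why an exact checksum only catches byte-identical copies. The 4x4 grayscale "image" and the helper names (md5_of_pixels, average_hash, hamming) are made up for the example, and the average hash is a toy stand-in for a real perceptual hash or a CNN embedding, not a replacement for one.

```python
import hashlib

def md5_of_pixels(pixels):
    """MD5 hex digest of a grayscale image given as rows of 0-255 ints."""
    flat = bytes(v for row in pixels for v in row)
    return hashlib.md5(flat).hexdigest()

def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is above the mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]

def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical 4x4 grayscale image and a copy with a single pixel nudged by 1.
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 35, 225],
    [18, 215, 28, 235],
]
tweaked = [row[:] for row in original]
tweaked[0][0] += 1

# The exact checksum treats the two images as unrelated.
md5_match = md5_of_pixels(original) == md5_of_pixels(tweaked)

# The coarse hash barely moves, so a small distance threshold flags a near-duplicate.
distance = hamming(average_hash(original), average_hash(tweaked))

print("MD5 match:", md5_match)
print("average-hash Hamming distance:", distance)
```

So the checksum works as a cheap first pass for exact reposts, and anything that has been recompressed, cropped, or edited needs a similarity measure with a distance threshold instead.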
Also a sequential analysis of word count/phrasing would be good, since shills post short one liners, snide remarks, and don't have time to author organic posts with real argumentation.
On top of that I speculate you should factor in time of day, relative occurrence of similar pictures at time of posting, thread topic, and a few other parameters to further improve prediction.
Ideally I think generating a ghost profile for a suspected shill is helpful too. If the posting behavior has been tagged before, during the same shift, using similar images and phrasing etc.
At the end of the day, I wonder what relevance such a project would have today, when boards are fragmented.
It could make great memes to pass around reminding plebbitors, teens and newfags what is going on on the chans.
It could be expanded into providing interesting functions to the smaller boards, to give certain users a hidden "social credit score" to avoid infiltration in more serious projects.
Why I brought the topic up is I was more into the idea of tracking a few select (((organizations))) and map the response of provocations on social media, media coverage from operations etc and see what can be learned from that.
I speculate his is one of the factors accounting for machine learning coming into existence. (((they))) have an entire field of mathematics developed since 40s/50s to describe "industries", what we would call factories or businesses, as block diagram components/objects that have responses to stimuli, and many transforms and commonly used theorems would apply. Laplace, DCT, wavelets etc..
(((they))) calibrate their modeling by for example dumping one specific type of stock one day, and measure the system response as it propagates through an entire national/regional/global economy.
Even if it is barely detectable to us, they know the exact stimuli introduced to the system, and know which metrics to track for good generalization.
So in a way they end up with stacked differential equations, which, as it happens, neural networks are good at approximating.
I was speculating that the same can be done to track censorship, responses, shilling, fake news due to known events, if one fabricates memes designed to trigger cascading buzz and ripples across the duck pond. Or latch on tracking organic memes that get traction.
Modeling an entire economy is impressive and abstract, but tracking say two dozen NGOs, some news outlets, A_D_L, twatter and a few other sources is not an impossible undertaking.
If it works it ought to give a good reference what types of memes have the most spread, most outrage, trigger most/least action from said (((groups))), plebs, forums, what associated tags are used.
There should be a few sweet spots that can be exploited, instead of resorting to trial and error, time and random nature of chan culture.
If a coordinated effort was made by people aware of such metrics, the impact could, I theorize be 100 times greater than forcing/pushing what ever ideas trickle down the grapevine.
Of course one could argue that a psychological approach to get normies into reacting is as valid. The problem is we generally make the wrong assumptions, and people are affected by everything going on.
Measuring total system response has always worked, and I see no reason it can't be used to look for sore points...
I want to be clear, I don't believe this will dramatically impact the world, but rather provide possibly valuable insights for meming IRL and online.