/g/ - Technology

Computers, Software, Technology


(69.83 KB 225x225 glownigger.png)
(201.01 KB 520x1755 cultstate.png)
OSINT Tools and Techniques Anonymous 01/11/2020 (Sat) 04:50:37 ID:4dfa70 No. 913
Tools to help level the playing field in this psywar: Link to original (archived) thread: http://archive.is/MQrCd A little taste from the thread (please add any tools not contained within): Facebook Ad Library https://www.facebook.com/ads/library/?active_status=all&ad_type=political_and_issue_ads&country=US >The Ad Library provides advertising transparency by offering a comprehensive, searchable collection of all ads currently running from across Facebook Products. Anyone can explore the Library, with or without a Facebook account. This can be used to see where advertising is being used to attract and coordinate Antifa. SnapStory https://github.com/sdushantha/SnapStory >Scrape public storie1 from SnapChats. Useful for gathering intelligence on Antifa nodes using SnapChat. Entro.py https://github.com/andrew-vii/Entro.py >Monitors chat status activity of Facebook uses to know when they are on their device and when they are not. InstaLooter https://github.com/althonos/InstaLooter >Quickly scrape all Instagram photos from a person. SpiderFoot https://www.spiderfoot.net/ >SpiderFoot is a reconnaissance tool that automatically queries over 100 public data sources to gather intelligence on IP addresses, domain names, e-mail addresses, names and more. When combined with basic emoji replacement attacks (self-hosted images used in private chats that look like an emoji but give you the IP address of people who see them), this can be incredibly useful at identifying people from an IP address alone. Face Recognition https://github.com/ageitgey/face_recognition >Useful when paired with image scrapers to identify people in videos and images. linkedin2username https://github.com/initstring/linkedin2username >Generate username lists from companies on LinkedIn. 
Universal Reddit Scraper https://github.com/JosephLai241/Universal-Reddit-Scraper >This is a universal Reddit scraper where you can manually specify subreddits to scrape, specify which category of posts to scrape for each subreddit, and how many results are returned. Twint https://github.com/twintproject/twint >Twint is an advanced Twitter scraping tool written in Python that allows for scraping Tweets from Twitter profiles without using Twitter's API. Citadel https://github.com/jakecreps/Citadel >Curated list of vetted and useful OSINT tools. PwnBin https://github.com/kahunalu/pwnbin >PwnBin is a webcrawler which searches public pastebins for specified keywords. Sherlock https://github.com/sherlock-project/sherlock >Find a username across multiple social media networks. SkipTracer https://github.com/xillwillx/skiptracer >Get basic PII on a target from paywall sites. Graph Commons https://graphcommons.com/ >A useful tool to build graphs that detail and share the relationships between people, organizations, and data. Milanote https://milanote.com/ >A useful tool to build graphs that also lets you link to other public Milanote graphs. Draw.io https://draw.io/ >A basic diagramming tool. GeoCreepy https://www.geocreepy.com >A Geolocation OSINT Tool. Offers geolocation information gathering through social networking platforms.
>>920 >>922 >What was 8ch's Jewdar?
>>923 Yes but images are only one possible metric, anything available online can be combined with an image classifier. You can weigh your results with posting location/ip, vocabulary, style of writing, types of links shared etc. While on the topic, was thinking about a tool to expose shills. Either a bot, or a browser plugin. If one saves enough shill posts and agitating/divisive posts of the chans it should be possible to make a shill-meter. Not perhaps the most accurate, but can be very beneficial for newfags and to at least show case the vast number of shills/narrative control. Ideally a posting bot that breaks down posting patterns at 50 or 100 post intervals. The same way 8chan trained their image classifier you can train a NN with word sequences, location, posts by user, make dictionaries of words common/uncommon. All feds/shills tend to be leftist and the left cant meme, they don't get spergy humor and memes, write like boomers, reddit space, use normie humor, rcopy pasta, post during US working hours, there are tonnes of metrics to use, this should be somewhat easy to classify at a 90% + accuracy. ( not refering to the 1 post by user memeflag agitator but debating shills ) Why I'm thinking about such things is that GENERALLY the current "right" has nothing to hide, care about the truth, so the tool cant really be used against us the same way...
>>921 Sadly I do not. This is the place, and as you can see 16pol has yet to run a "successful" operation. Please start a thread if you have something you'd like to work on that you need help with. Your idea for creating a bot that analyzes jew intentions is a decent idea, I would have to see the results to be able to judge if such a thing is useful. This is far beyond my present skill level. >ShillSpotter9000 Bretty good idea. I don't personally have the ability to help with this project, but it is definitely worth pursuing. It could help the mods be more efficient as well. This site could make use of an image classifier as well. I would be happy to help work on this, but you would have to tell me what you need.
>>925 I know vanilla NNs/MLPs in and out, i work mostly with tabular data, all the math, all features, all regularization etc and can build them in python from scratch, but I cant scratch build RNNs or CNNs. Luckily there are libraries to do this... Currently starting to teach myself RNNs which is what to use for sequence data. CNNs I only have limited experience of, but I of course understand them, the only addition is the convolution. For this project I think a conventional vanilla NN/MLP can be used to analyze output from a RNN and eventual image classifier and weigh it with "tabular" metrics like location, time, character count, etc etc. People can trash talk vanilla NN/MLP all they want but it is a "universal approximator" and can do lots of regression, encoding and mapping. At the moment I cant make such an app myself, only provide part of the main functions, collect data, train, optimize etc. But I'm not in a hurry. We will need - Shell/site/server/app/container - Source code doing analysis - Training data - Scrapers I have been around dark/clear-net and looked for fun places. SADLY, this is actually one of the best forums right now, but sparsely populated. I don't mind though, because it is more suitable for running ops. We need to start posting more, and try to get intelligent anons here from 4pol/8kun without blowing the cover. 50 active intelligent people on here is more worth than 4pol with 100 000 users at its current state. I personally like this forum, and think if we just make sure to participate actively here more people will join.
>>926 >try to get intelligent anons here from 4pol/8kun without blowing the cover If you build it, they will come. Let them filter in on their own for now. This is not only my own opinion, but is the general consensus gleaned from at least a couple threads on the topic. >start posting more I only post when I have something to add to the conversation. I spent a little time here during the holidays then took a long break. I don't think we need to post more, we just need to post with purpose. Debate on various topics is important, but personally I am pushing for a change in gears from a research and analysis perspective of jewish influence and our own history to a phase of operations, happenings, and active participation in the "movement" irl. There are many people on this site willing to do their part, we just have to give them something to do (and keep blowing on the fire until it rages). I have said it before: we need another thotaudit, but with a better goal than taxing internet whores.
>>927 Well said! I'm at a similar place in my mind. Going to spend most my time making memes and meming IRL and working projects. Things do indeed need to happen. That is why the idea to track (((them))) or shills is interesting to me. So if you make a meme, you can literally see the buzz generated from who, where, what they say, how it is being mentioned. It sounds very complex, and is, if accurate representation of the general population is the goal. But tracking only a specific group could be really doable with all the abundant metrics. Hopefully one can create tools and break down the knowledge learned from the project so people get it. The beauty again is it cant be weaponized against us, because we are not trying to hide information and deceive A first step could be just monitoring shilled topics, it is obvious that shilling takes place certain hours, certain topics. Just posting infograpics with most shilled topics, times of day, word cloud, number of posts, thread length, replies, number of respondents etc, things easily scraped that don't require understanding a text would be reasonable. Basically just scraping and refreshing the catalog... If a project like this gets crawling at a steady pace I think more people will join. Now that github exists and most of the code can be copied from various projects it is rather easy to co-develop too.
TOOL FOR GENERATING FACES https://www.thispersondoesnotexist.com/ > Neural Network based generation of photorealistic faces of persons that do not exist. > Great for training neural networks or just trolling on social media. Can potentially be used to flood face recognition software in the future.
(194.82 KB 531x610 0.png)
(162.20 KB 614x405 1.png)
(160.50 KB 612x400 2.png)
(64.44 KB 1016x242 3.png)
(53.34 KB 621x717 4.png)
>>926 > get intelligent anons here from 4pol/8kun That's asking a bit much. ;^) but yes, they will filter through when as they mature. That said there are a few decent posters there I recognize from 8ch that are cramming redpills down ledditor throats. >>924 >this should be somewhat easy to classify at a 90% + accuracy. We had this from Joangate when building an image classifier in one thread. Pics related. She was using UUIDs for images to track how they propagated across the chans. So we built an image classifier and discovered almost the entire diy weapons thread in /k/ was glowing full of them. There was even effort to generate memes that were MD5 clashes of the images she was using, so they would be indistinguishable to her tracker and really fuck with her dataset. The net result was they stopped using that method and fractured the chans soon after. It worked out for the best as it pushed a lot of us to operate irl instead. >>928 >Just posting infograpics with most shilled topics, times of day, word cloud, number of posts, thread length, replies, number of respondents etc, things easily scraped that don't require understanding a text would be reasonable. >Basically just scraping and refreshing the catalog... A few anons had scripts to do this too. They would auto post thread stats that showed (1)s and other data.
(845.16 KB 2632x1304 OSINT_Landscape_v1.jpg)
(656.95 KB 1422x785 1 yQf33KwCUEeUt3lXde1lQw.png)
(618.58 KB 2000x3000 1 cuTSPlTq0a_327iTPJyD-Q.png)
(118.70 KB 780x578 1 CYWUWV2JyY-ZqsRKHIb7Jg.jpeg)
BASIC GUIDE TO SCRAPE & ANALYZE WITH NEURAL NETWORK
Assuming you know a little programming: if asked to choose one thing to learn in OSINT and data mining, for me it is machine learning, AI, or neural networks. In everyday use these terms overlap heavily. It is unbelievably powerful to be able to approximate an arbitrary function between an input and an output. It is the future, and can be implemented in almost every imaginable field.
PYTHON
Python has become insanely popular and is very forgiving and easy to use compared to low-level languages. Even someone with basic web scripting knowledge can get into it and learn the syntax fast. There are vast sources of information: tutorials, manuals, git repositories etc. Several machine learning frameworks are built on Python.
NEURAL NETWORKS
Neural networks have been around since the 1950s, and they sound more complicated than they are. The network loops through training data and maps or "learns" simple nonlinear functions between inputs and successive layers. It is basically just weighted sums of the nodes in a previous layer, stacked in an array. This is the core of "AI". The reason it has boomed in the last decades is mostly not advancements in modelling, but the accessibility of computing power/performance.
BASIC PRINCIPLE
A NN is basically just a layered graph of nodes interconnected with each other: layers of stacked weighted sums. A single node/neuron response is more or less just linear regression, the weighted sum of the inputs P times the weights w, plus a constant b called the bias. It can be described as:
Z = sum( P * w ) + b
where P = previous layer. A dot product is most commonly used to compute the sum. An activation function is then applied to Z to remap the response non-linearly, so the node and the network can learn abstract and non-linear relations. This is what differentiates a neural network from linear regression. Common activation functions are Sigmoid, TanH, ReLU, Leaky ReLU, Softmax.
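A minimal sketch of the single node described above, in plain Python so nothing beyond the standard library is assumed. The function names (neuron, sigmoid, relu) and the input values are made up for illustration; only the formula Z = sum( P * w ) + b comes from the post.

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    # Passes positive values through, clips negatives to zero.
    return max(0.0, z)

def weighted_sum(p, w, b):
    # Z = sum(P * w) + b: dot product of the previous layer and the weights, plus bias.
    return sum(pi * wi for pi, wi in zip(p, w)) + b

def neuron(p, w, b, activation=sigmoid):
    # A node's response is the activation function applied to its weighted sum.
    return activation(weighted_sum(p, w, b))

# Hypothetical example values: three inputs from a previous layer.
p = [0.5, -1.0, 2.0]
w = [0.1, 0.4, -0.2]
b = 0.25

z = weighted_sum(p, w, b)      # -0.5, up to float rounding
a_sigmoid = neuron(p, w, b)    # sigmoid(-0.5)
a_relu = neuron(p, w, b, relu) # negative Z is clipped to 0.0
print(z, a_sigmoid, a_relu)
```

Without the activation step the node is exactly a linear regression; swapping sigmoid for relu changes only how Z is remapped.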
ARCHITECTURE
Multi Layer Perceptron (MLP) is a universal model, basically doing stacked regression. Recurrent Neural Networks (RNNs) are primarily for sequence data and contain "memory cells". They are common for time series, word/character sequences, transcribing events in video etc. Convolutional Neural Networks (CNNs) are more suitable for image data, or data where distinct features can appear anywhere and are not bound to a fixed spectrum/table position. To some extent the types can stand in for one another, but at the cost of lower accuracy, longer computing times etc. Many would argue that an RNN is the standard approach to time series forecasts, because the architecture uses timestep sequences as feature maps for cell/node activation. The RNN retains activations from earlier steps in its recurrent cell, so it "looks back" a defined number of steps, as opposed to just having weighted sums of the connections to the previous layer like an MLP or vanilla NN. That is all great, but for the purpose of introducing neural networks an RNN is too abstract to implement: the forward propagation of data is harder to follow in code, and the backward propagation and the derivatives needed for weight updates are very abstract to get a feel for initially.
CONCLUSION
An MLP is recommended for learning this as a beginner, which is why I have focused on MLP problem solving in this post. Also, I would like to point out that there are machine learning libraries available, Keras and TensorFlow most notably, but I think it is counterproductive not to write the code from scratch when learning. Expect a steep learning curve. It is not the simplest thing in the world, but it gets easier with time, like learning a new language.
RESOURCES
OSINT AML TOOLBOX https://medium.com/@aml_toolbox/financial-crimes-osint-tools-banking-5ede7edbc14f https://start.me/p/rxeRqr/aml-toolbox?embed=1
> Compiled links from a money laundering analyst (relevant).
OUTSTANDING VIDEO EXPLANATION OF MLP https://www.3blue1brown.com/neural-networks
> This is how I got into this stuff, this guy is great at making complex math easy to grasp.
CHAIN RULE https://en.wikipedia.org/wiki/Chain_rule
> Arguably the only abstract and difficult part in building a basic neural network. It sucks, but it is rather necessary to know intuitively.
PYTHON NUMPY MANUAL (hosted on scipy.org) https://docs.scipy.org/doc/numpy/index.html
PYTHON WEB SCRAPING BREAKDOWN https://realpython.com/beautiful-soup-web-scraper-python/
> Beginner friendly approach to basic website scraping in Python.
BASIC VANILLA NEURAL NETWORK (MLP) STOCK PRICE TUTORIAL https://medium.com/mlreview/a-simple-deep-learning-model-for-stock-price-prediction-using-tensorflow-30505541d877
> Time series data can be predicted with a conventional feed forward neural network / MLP; they are considered universal approximators.
BASIC STOCK SCRAPING PYTHON TUTORIAL https://towardsdatascience.com/stock-market-analysis-in-python-part-1-getting-data-by-web-scraping-cb0589aca178
> Applied exa
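As a rough from-scratch illustration of the MLP and chain rule items in the guide above, here is a sketch of a tiny network with one hidden layer, written in plain Python. The layer sizes, learning rate, sample values and all function names are assumptions chosen for the example, not something taken from the linked tutorials.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, params):
    # input -> hidden layer (sigmoid) -> single output node (sigmoid)
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(xi * wi for xi, wi in zip(x, row)) + bj)
         for row, bj in zip(W1, b1)]
    y = sigmoid(sum(hj * wj for hj, wj in zip(h, W2)) + b2)
    return h, y

def loss(x, t, params):
    # Squared error between the prediction y and the target t.
    _, y = forward(x, params)
    return 0.5 * (y - t) ** 2

def backward(x, t, params):
    # Chain rule: dL/dz2 = dL/dy * dy/dz2, then pushed back through the hidden layer.
    W1, b1, W2, b2 = params
    h, y = forward(x, params)
    dz2 = (y - t) * y * (1.0 - y)
    gW2 = [dz2 * hj for hj in h]
    dz1 = [dz2 * wj * hj * (1.0 - hj) for wj, hj in zip(W2, h)]
    gW1 = [[d * xi for xi in x] for d in dz1]
    return gW1, dz1, gW2, dz2  # gradients for W1, b1, W2, b2

def train_step(x, t, params, lr=0.5):
    # One step of plain gradient descent on a single sample.
    W1, b1, W2, b2 = params
    gW1, gb1, gW2, gb2 = backward(x, t, params)
    W1 = [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
    b1 = [b - lr * g for b, g in zip(b1, gb1)]
    W2 = [w - lr * g for w, g in zip(W2, gW2)]
    return W1, b1, W2, b2 - lr * gb2

random.seed(0)
params = (
    [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)],  # W1: 3 hidden x 2 inputs
    [0.0, 0.0, 0.0],                                                # b1
    [random.uniform(-1, 1) for _ in range(3)],                      # W2
    0.0,                                                            # b2
)
x, t = [0.5, -1.0], 1.0  # one made-up training sample and its target

loss_before = loss(x, t, params)
for _ in range(200):
    params = train_step(x, t, params)
loss_after = loss(x, t, params)
print(loss_before, loss_after)
```

A real model would train on many samples and hold some back for validation; the single sample here only shows the mechanics of forward pass, chain-rule gradients and weight update.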
>>930 Amazing, I completely missed the shill tracker. Were you tracking md5 check sums even 1 changed pixel would generate a different hash. So MD5 can be used as a first stage classification of obvious duplicates, but I think a CNN based image classifier is a must. Also a sequential analysis of word count/phrasing would be good, since shills post short one liners, snide remarks, and don't have time to author organic posts with real argumentation. On top of that I speculate you should factor in time of day, relative occurrence of similar pictures at time of posting, thread topic, and a few other parameters to further improve prediction. Ideally I think generating a ghost profile for a suspected shill is helpful too. If the posting behavior has been tagged before, during the same shift, using similar images and phrasing etc. At the end of the day, I wonder what relevance such a project would have today, when boards are fragmented. It could make great memes to pass around reminding plebbitors, teens and newfags what is going on on the chans. It could be expanded into providing interesting functions to the smaller boards, to give certain users a hidden "social credit score" to avoid infiltration in more serious projects. Why I brought the topic up is I was more into the idea of tracking a few select (((organizations))) and map the response of provocations on social media, media coverage from operations etc and see what can be learned from that. I speculate his is one of the factors accounting for machine learning coming into existence. (((they))) have an entire field of mathematics developed since 40s/50s to describe "industries", what we would call factories or businesses, as block diagram components/objects that have responses to stimuli, and many transforms and commonly used theorems would apply. Laplace, DCT, wavelets etc.. 
(((they))) calibrate their modeling by for example dumping one specific type of stock one day, and measure the system response as it propagates through an entire national/regional/global economy. Even if it is barely detectable to us, they know the exact stimuli introduced to the system, and know which metrics to track for good generalization. So in a way they end up with stacked differential equations, which happens neural networks are great at solving. I was speculating that the same can be done to track censorship, responses, shilling, fake news due to known events, if one fabricates memes designed to trigger cascading buzz and ripples across the duck pond. Or latch on tracking organic memes that get traction. Modeling an entire economy is impressive and abstract, but tracking say two dozen NGOs, some news outlets, A_D_L, twatter and a few other sources is not an impossible undertaking. If it works it ought to give a good reference what types of memes have the most spread, most outrage, trigger most/least action from said (((groups))), plebs, forums, what associated tags are used. There should be a few sweet spots that can be exploited, instead of resorting to trial and error, time and random nature of chan culture. If a coordinated effort was made by people aware of such metrics, the impact could, I theorize be 100 times greater than forcing/pushing what ever ideas trickle down the grapevine. Of course one could argue that a psychological approach to get normies into reacting is as valid, problem is we generally make the wrong assumptions and people are effected by everything going on. Measuring total system response has always worked, and I see no reason it cant be used to look for sore points... I want to be clear, I don't believe this will dramatically impact the world, but rather provide possibly valuable insights for meming IRL and online.
>>924 >All feds/shills tend to be leftist and the left cant meme, they don't get spergy humor and memes, write like boomers, reddit space, use normie humor, rcopy pasta, post during US working hours, there are tonnes of metrics to use, this should be somewhat easy to classify at a 90% + accuracy. ( not refering to the 1 post by user memeflag agitator but debating shills ) hur durr we stronk <aka the stupid myth that needs to die <NOW
>>919 I saved the a copy of the whole thread through Firefox and I have a .png file of that picture from the files. I'm not sure if the picture saved from those files would be a copy of the "original" with the text file (also since the file size of the picture is also 8kb) but see if you can save it to a .zip.
>>934 I tried that too. Unfortunately it doesn't seem to work.
>>933 No kidding. >The left can't meme Just lol, thank you for calling him on that. If he wasn't so eager to help I would have told him to lurk another year.
>>933 >>936 I don't count subversion, or geeky spooks who sit around trying to create something witty. I mean in general, the left and adherents of the mainstream ideology lack distance to the self/ego and their jokes are generally construed. I can provide many examples of this and how it is expressed in other cultures sharing the same type of adherence and consensus culture. I studied with Muslims briefly at university, they lack creative thinking, independence and are generally stiff, same goes for Chinese people and many Asian cultures with strong peer pressure. Western lefties are the same as radical muslims, but their need to believe and belong manifests as obsession with the "progressive" ideals and the wonders of equality instead of scripture. But of course there are exceptions, where I live the left is notorious for keeping death lists of opponents and secretly waiting for revolution. These are not "leftists" but psychopaths exploiting the gullible through left wing ideology, they usually end up killing each other to prevent coups. Those people are scary... But go ahead, post examples of successful leftist memes that have taken the internet by storm...
(57.12 KB 1200x724 try.png)
(1010.73 KB 1434x816 harder.jpg)
(134.89 KB 1642x924 sweaty.jpg)
>>937 I'm being polite here: please lurk more.
>>932 >Were you tracking md5 check sums UUID + MD5 + SHA (various) >even 1 changed pixel would generate a different hash I haven't had a coffee yet this morning so I might be missing why you mentioned this, so wtf? That's kinda the whole idea of hashes, isn't it? Anons were working on generating colorful memes that were MD5 collisions of Joan's tracked images for the lulz. ;^)
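The point about hashes can be checked with a short sketch using Python's standard hashlib. The byte string below is a made-up stand-in for a file's contents, and the helper names are hypothetical; flipping a single bit changes both the MD5 and SHA-256 digests, so exact hashes only catch byte-identical duplicates.

```python
import hashlib

# Stand-in for the raw bytes of a file (assumption: any byte string behaves the same way).
original = bytes(range(256)) * 4

# Copy the data and flip one bit in one byte.
tweaked = bytearray(original)
tweaked[10] ^= 0x01
tweaked = bytes(tweaked)

def digests(data):
    # Hex digests of the same data under two hash functions.
    return hashlib.md5(data).hexdigest(), hashlib.sha256(data).hexdigest()

def differing_bits(hex_a, hex_b):
    # Number of bit positions where two equal-length hex digests differ.
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

md5_a, sha_a = digests(original)
md5_b, sha_b = digests(tweaked)

print(md5_a, md5_b, differing_bits(md5_a, md5_b))
print(sha_a, sha_b, differing_bits(sha_a, sha_b))
```

For a good hash function the number of differing bits typically lands near half the digest length, which is why a one-pixel edit is enough to defeat exact-match lookups.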
>>913 just bumping. don't want this thread to get pushed off the catalog
>>940 Seconded. The kikes are obviously using AI to shill and to censor the fuck out of everything. We need to turn it back around on them.
>>940 Final bump, but next time near bottom will transfer to >>>/g/ for safe keeping
>>942 This is a good thread. You should consider stickying it. The politics boards are a decentralized OSINT information aggregation attack against the aristocracy. Every board should have a thread like this encouraging people to collect and organize information for expedient consumption instead of trolling about niggers.
>>943 OP here. Thank you. /g/ seems like a reasonable place to keep it. Maybe sticky it there, I agree it is important to share and build on this information in times like these. Really "we" should be working to create private networks for ourselves (not based on larping). The FEDs have moved us pretty high up the list. I still don't have anyone I can trust irl.
Paranoid People use Exiv2 to remove metadata from some picture types:
exiv2 -d a *.png *.jpg
Paranoid People use FFMPEG to remove some metadata from videos:
ffmpeg -y -i my_input_file.mp4 -c copy -map_metadata -1 -metadata title="some file" -metadata creation_time=1997-01-01T01:01:00 -map_chapters -1 my_output_file.mp4
Tip: -c copy copies the streams without re-encoding, so the output format must be the same as the input format. Convert to another format in a second pass.
Paranoid People reset the timestamp on their files before adding them to an archive:
touch -t 199701010000 ./my_files/*
Paranoid People do not let 7-zip add timestamp metadata to files (hint: -mtm-):
7za a -t7z -m0=lzma2 -mx=9 -aoa -mfb=64 -md=32m -ms=on -mtm- ./file.7z ./dir/*
or, if you want to password encrypt the files, also add: -mhe=on -p
You can use a GPG keypair instead, but anything encrypted to that key can be decrypted by anyone who holds its private key.
Paranoid People don't use real names on their computer login.
Beating word filters: put a message in a text file, or use echo, and then:
echo -en 'Fuck The Word Filters' | base64 -w0
Output: RnVjayBUaGUgV29yZCBGaWx0ZXJz
Now let's decode it ...
echo -en 'RnVjayBUaGUgV29yZCBGaWx0ZXJz' | base64 -d
Output: Fuck The Word Filters
But ... can I do this with a file, you ask?
cat f.jpg | base64 -w0 > f.txt
Hint: -w0 disables line wrap. Adjust as needed.
cat f.txt | base64 -d > f.jpg
If you have a highly compressible file, you can also pipe it through xz or bzip2 or 7za (7-zip) first:
cat some_big_text_file.txt | xz -9ec | base64 -w0 > some_big_file_is_smaller_and_encoded.txt
Maybe a site allows text files, but not mp3 or 7-zip or whatever:
cat my_song.mp3 | base64 > my_song.txt
Then others can decode it with:
base64 -d ./my_song.txt > ./my_song.mp3
spam edited
Edited last time by Anonymous on 05/18/2020 (Mon) 23:56:50.
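For anyone without the shell tools, the base64 and xz pipeline in the post above can be approximated with Python's standard library. This is only a sketch: encode_text and decode_text are hypothetical helper names, and the lzma preset is chosen to mirror xz -9e.

```python
import base64
import lzma

def encode_text(data: bytes, compress: bool = False) -> str:
    # Optionally xz-compress first (roughly "xz -9ec"), then base64-encode without line wraps.
    if compress:
        data = lzma.compress(data, preset=9 | lzma.PRESET_EXTREME)
    return base64.b64encode(data).decode("ascii")

def decode_text(text: str, compressed: bool = False) -> bytes:
    # Reverse of encode_text: base64-decode, then decompress if needed.
    data = base64.b64decode(text)
    if compressed:
        data = lzma.decompress(data)
    return data

message = b"hello world"
encoded = encode_text(message)
print(encoded)               # aGVsbG8gd29ybGQ=
print(decode_text(encoded))  # b'hello world'

# Highly repetitive data shrinks a lot when compressed before encoding.
big = b"abc" * 10000
plain_len = len(encode_text(big))
packed_len = len(encode_text(big, compress=True))
print(plain_len, packed_len)
```

Base64 inflates data by about a third, so compressing first only pays off when the input is compressible; already-compressed formats such as JPEG or MP3 gain little.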
(402.89 KB 1024x673 dox2.jpg)
(188.17 KB 748x1059 Googlefu.gif)
Bamp and related optimizations of data aggregation.
>>945 Any work arounds for the YT shadow banning? Any clue how do the algorithms or filters work? Still the best platform to get information out in the comments, we must not be silenced by them.

