The U.S. space agency launched a new web-based search engine for much of its catalog of images, video and audio files, which you can browse by keyword and metadata, so that you never have to remember the dismal reality that you’re earthbound ever again.
Rockfleet Castle, Co. Mayo, Ireland. It’s a former home of Grace O`Malley (Gráinne Mhaol), the famous 16th century ‘Pirate Queen’.
Photo: Mikeoem (CC-BY-SA-4.0 )
Millions of posts are published on Tumblr everyday. Understanding the topical structure of this massive collection of data is a fundamental step to connect users with the content they love, as well as to answer important philosophical questions, such as “cats vs. dogs: who rules on social networks?”
As first step in this direction, we recently developed a post-categorization workflow that aims at associating posts with broad-interest categories, where the list of categories is defined by Tumblr’s on-boarding topics.
Posts are heterogeneous in form (video, images, audio, text) and consists of semi-structured data (e.g. a textual post has a title and a body, but the actual textual content is un-structured). Luckily enough, our users do a great job at summarizing the content of their posts with tags. As the distribution below shows, more than 50% of the posts are published with at least one tag.
However, tags define micro-interest segments that are too fine-grained for our goal. Hence, we editorially aggregate tags into semantically coherent topics: our on-boarding categories.
We also compute a score that represents the strength of the affiliation (tag, topic), which is based on approximate string matching and semantic relationships.
Given this input, we can compute a score for each pair (post,topic) as:
where
w(f,t) is the score (tag,topic), or zero if the pair (f,t) does not belong in the dictionary W.
tag-features(p) contains features extracted from the tags associated to the post: raw tag, “normalized” tag, n-grams.
q(f,p) is a weight [0,1] that takes into account the source of the feature (f) in the post (p).
The drawback of this approach is that relies heavily on the dictionary W, which is far from being complete.
To address this issue we exploit another source of data: RelatedTags, an index that provides a list of similar tags by exploiting co-occurence patterns. For each pair (tag,topic) in W, we propagate the affiliation with the topic to its top related tags, smoothing the affiliation score w to reflect the fact these entries (tag,topic) could be noisy.
This computation is followed by filtering phase to remove entries (post,topic) with a low confidence score. Finally, the category with the highest score is associated to the post.
This unsupervised approach to post categorization runs daily on posts created the day before. The next step is to assess the alignment between the predicted category and the most appropriate one.
The results of an editorial evaluation show that the our framework is able to identify in most cases a relevant category, but it also highlights some limitations, such as a limited robustness to polysemy.
We are currently looking into improving the overall performances by exploiting NLP techniques for word embedding and by integrating the extraction and analysis of visual features into the processing pipeline.
What is the distribution of posts published on Tumblr? Which categories drive more engagements? To analyze these and other questions we analyze the categorized posts over a period of 30 days.
Almost 7% of categorized posts belong to Fashion, with Art as runner up.
The category that drives more engagements is Television, which accounts for over 8% of the reblogs on categorized posts.
However, normalizing by the number of posts published, the category with the highest average of engagements per post isGif Art, followed by Astrology.
Last but not least, here are the stats you all have been waiting for!! Cats are winning on Tumblr… for now…
I seek the Word of @cranquis and the Word of @wayfaringmd on proper tick removal technique. Where I live we’re being warned that populations will be high this summer and that 50% of the ticks are testing positive for Lyme disease. Between the cat, the dog, and three kids I know I’m going to have to deal with them soon.
I’m hearing so many conflicting things, even from MDs. Burn them with a match? Pour olive oil on them? I thought we weren’t supposed to do that stuff? Should I buy that fancy tick remover thing? Or does each one require a trip to the office? Help me Crayfaring, you’re my only hope! 😩
Known as bubble algae or sailor’s eyeballs, Valonia ventricosa are one of the world’s largest single-celled organisms. They’re found in almost every ocean in the world, mostly in tropical and sub-tropical regions among coral rubble.
These tough, shiny multi-nucleic cells, a kind of green algae, usually grow to be 0.4 to 1.5 inches in diameter but sometimes reach up to 2 inches across. By comparison, most human cells are so small they’re invisible to the naked eye; Valonia ventricosa are larger than your fingernail!
photograph by Alexander Vasenin | Wikipedia
via: American Museum of Natural History
The University of Reading holds the archive of original artwork for the much-loved Ladybird children’s book. This painting on board was used to illustrate Exploring Space, a Ladybird ‘Achievements’ Book first published in 1964. The artwork was created by Brian Knight.
If you look closely at the painting, you can see the faint trace of Knight’s initial design for the lunar landing module - just visible under the later amendment.
Published before the first Moon landing in 1969, the fantasy spacecraft was sleek and utopian. It typifies the extent to which The Space Race captured our mid-century imaginations and permeated visual culture. The later correction, based on the Eagle Lunar Module, was printed in subsequent revisions to the book. It was an acknowledgment of a successful mission and testament to Ladybird’s emphasis on accuracy for its young readers.
All artwork is © Ladybird Books Ltd.
I am pretty sure that this cannot be true because I saw an ad from the corn industry that said high fructose corn syrup is good for you…
A range of diseases – from diabetes to cardiovascular disease, and from Alzheimer’s disease to attention deficit hyperactivity disorder – are linked to changes to genes in the brain. A new study by UCLA life scientists has found that hundreds of those genes can be damaged by fructose, a sugar that’s common in the Western diet, in a way that could lead to those diseases.
However, the researchers discovered good news as well: An omega-3 fatty acid known as docosahexaenoic acid, or DHA, seems to reverse the harmful changes produced by fructose.
“DHA changes not just one or two genes; it seems to push the entire gene pattern back to normal, which is remarkable,” said Xia Yang, a senior author of the study and a UCLA assistant professor of integrative biology and physiology. “And we can see why it has such a powerful effect.”
Qingying Meng, Zhe Ying, Emily Noble, Yuqi Zhao, Rahul Agrawal, Andrew Mikhail, Yumei Zhuang, Ethika Tyagi, Qing Zhang, Jae-Hyung Lee, Marco Morselli, Luz Orozco, Weilong Guo, Tina M. Kilts, Jun Zhu, Bin Zhang, Matteo Pellegrini, Xinshu Xiao, Marian F. Young, Fernando Gomez-Pinilla, Xia Yang. Systems Nutrigenomics Reveals Brain Gene Networks Linking Metabolic and Brain Disorders. EBioMedicine, 2016; DOI: 10.1016/j.ebiom.2016.04.008
Americans get most of their fructose in foods that are sweetened with high-fructose corn syrup, an inexpensive liquid sweetener made from corn starch, and from sweetened drinks, syrups, honey and desserts. The Department of Agriculture estimates that Americans consumed an average of about 27 pounds of high-fructose corn syrup in 2014. Credit: © AlenKadr / Fotolia
im putting together a couple of scottish folk mixes bc that’s what i do and im honestly curious if anyone in my country has ever been unequivocally happy about anything ever
A reblog of nerdy and quirky stuff that pique my interest.
291 posts