Social Media, Data Mining & Machine Learning

ACM TIST Special Issue on Search and Mining User-generated Contents

2010-10-12T13:45:00.002+02:00

Deadline: 1 December 2010
More info on the ACM TIST webpage

Social media have changed the way information is generated and consumed. At first, information was generated by one person and “consumed” by many, but nowadays most of the information available on the Web is generated by users, which has changed the needs in information access and management. Social networks like Facebook or Twitter manage tens of PB of information, with flows of hundreds of TB per day, and hundreds of billions of relationships.

User generated content provides an excellent scenario to apply the metaphor of mining any kind of information. In a social media context, users create a huge amount of data where we can look for valuable nuggets of knowledge by applying diverse search (information retrieval) and mining techniques (data mining, text mining, web mining, opinion mining). In this kind of data, we can find both structured information (ratings, tags, links) and unstructured information (text, audio, video), and we have to learn how to combine existing techniques in order to take advantage of the existing information heterogeneity while extracting useful knowledge.
The primary goal of this special issue of ACM Transactions on Intelligent Systems and Technology is to foster research on the interplay between Social Media, Data/Opinion Mining and Search, aiming to reflect current developments in technologies that exploit user generated contents.

Topics of Interest

We invite researchers and professionals from a broad range of disciplines to submit to this special issue. Papers may encompass any or all of the following types of works: foundational theoretical analyses, modelling, simulation, and empirical studies. Moreover, authors may examine different aspects of search and mining user generated contents in a variety of possible contexts. Topics of interest include, but are not limited to:

A) Mining Social Media

Social networks analysis/mining
Tagging/links/graphs analysis and mining
Community detection and evolution
Influence, trust and privacy analysis
Social media monitoring/analysis

B) Opinion Mining and Sentiment Analysis

Opinion extraction/classification/summarization/visualization
Temporal sentiment analysis
Cross-lingual/cross-domain sentiment analysis
Irony detection in opinion mining
Wish analysis
Product review analysis

C) Search in Social Media

Novel social search algorithms
Social ranking
Multi-entity search
Multifaceted search
User Modelling and Personalization in Social Media

D) Other Social Intelligent Systems

Social Recommender systems
Semantic Social Media
Market analysis
Cross-lingual/cross-domain social intelligent systems

Submission Guidelines

Manuscripts submitted to the special issue should contain original material not published in nor submitted to other journals. Each paper will be reviewed by at least 3 expert reviewers. Papers which do not meet publication quality standards, or do not pass the editorial assessment of suitability for this special issue, will be rejected before the review process.
Full papers should be sent via TIST's On-Line Submission system http://mc.manuscriptcentral.com/tist (please select “Special Issue: Search and Mining User Generated Contents” as the manuscript type), and should not exceed 20 pages in length. Details of the journal and manuscript preparation are available on the website: http://tist.acm.org/

SMUC Workshop (CIKM): Open Invitation to Researchers from the Industry

2010-08-12T21:35:00.004+02:00

This year we are organizing the SMUC Workshop (Search and Mining User-generated Contents) at CIKM. We are now putting together an industry panel on related topics, and we would like to have several researchers from big and small companies on it.

We think industry-related activities at workshops like SMUC are very important: they make it easier for academic researchers to discover new trends and research topics, and they allow a more direct conversation between the academic and industry worlds.

If you work on any topic related to the SMUC Workshop (social search, mining social networks, opinion mining, etc.) at a start-up or a small, medium, or big company, you're planning to attend CIKM, and you'd like to join us on this industry panel, please contact us at josecarlos.cortizo@uem.es. We'd be pleased to have you on the panel.

Deadline extension: Workshop NLP in the Enterprise: Envisioning the Next 10 Years

2010-06-18T17:25:00.001+02:00

The workshop "NLP in the Enterprise: Envisioning the next 10 years" (PLN-E) will be held in Valencia (Spain), as a satellite workshop of SEPLN 2010, on September 6-7, 2010.

PLN-E aims to become a meeting point for academic researchers and companies interested in Natural Language Processing and related technologies like text mining, information retrieval, opinion mining, etc.

TOPICS OF INTEREST
================

We invite submissions of work that shows how NLP can help enterprises solve problems or develop new products and services. We expect 3 different types of work:

* (Academic/enterprise) Research work at an early stage but with a clear orientation to enterprise applications (products/services)
* Work showing how one or several NLP technologies have been successfully used to solve a specific problem (such as developing a new product/service)
* Work from companies explaining specific problems that cannot be solved with the current state of the art in NLP.

Topics of interest include, but are not limited to, the following:

A) New NLP Technologies with a clear Application in the Enterprise

1. Opinion Mining and Sentiment Analysis
2. Monolingual and Multilingual Information Systems
3. Voice Question-Answering Systems
4. Social Intelligent Systems
5. Plagiarism Detection and Systems for Detecting Confidential Information
6. Recommender Systems
7. Illicit contents/Crimes Detection Systems

B) New Application Domains

1. Social Media
2. Mobile Devices
3. Video Game Consoles and Digital Terrestrial Television (TDT)
4. Virtual Worlds and Massive Multiplayer Games

C) Business Aspects of NLP

1. New Business Models
2. Unsolved Problems
3. Scalability and Performance of NLP Technologies
4. Production Environments for NLP Systems
5. Real-World Datasets Descriptions
6. Demands and Market Needs

DATES
=================

* Deadline [extended]: 28 June 2010
* Acceptance: 15 July 2010
* Camera-ready due: 15 August 2010
* Workshop: 6-7 September 2010

ORGANIZING COMMITTEE
==================

* Jose Carlos Cortizo (BrainSins, Universidad Europea de Madrid) - contact person (josecarlos.cortizo@wipley.com)
* Jose Maria Gomez (Optenet)
* Francisco Manuel Rangel (Corex)
* Victor Peinado (MAVIR)
* Hugo Zaragoza (Yahoo! Research)
* Francisco Manuel Carrero (BrainSins, Universidad Europea de Madrid)

PROGRAM COMMITTEE
==================

* Rodrigo Agerri (Vicomtech Research Centre)
* Matxalen Alfaro (Sarenet)
* Claudio Baccigalupo (VLEX)
* Sergio Berna (ExperienceOn)
* Enric Castellon (Thera)
* Juan Manuel Cigarran (UNED, Consorcio MAVIR)
* Jesus Contreras (ISOCO)
* Javier Cuervo (Redepyme, EOI)
* Luis Ignacio Diaz (Acciona I+D)
* Jose Gregorio Escalada (Telefonica I+D)
* Diego Exposito (Answare Technologies)
* Angel Faus (VLEX)
* Jorge Garcia Betanzos (Sarenet)
* Ricardo Farreres (Thera)
* Anabel Fraga (UC3M/REUSE)
* Juan Antonio Garrido (i2factory)
* Galo Gimenez (HP)
* Francisco Gomez Molinero (Visual Tools)
* Jose Carlos Gonzalez (Daedalus)
* Carlos Gonzalez (ExperienceOn)
* Alberto Gragera (Tuenti)
* Francesc Grau (Conzentra)
* Carlos Lamas (T-Systems Iberia)
* Didac Lee (Inspirit, Spamina)
* Juan Llorens (UC3M/REUSE)
* Miguel Lucas (Acteo Soluciones)
* Diego Martin (Stratebi Business Solutions)
* Javier Martin (Loogic)
* Juan Carlos Martinez (Corex)
* Daniel Martinez (Indra Software Labs)
* Borja Monsalve (Social Gaming Platform)
* John Paul Moore (ATOS Research)
* Cesar de Pablo (UC3M, Consorcio MAVIR)
* Enrique Puertas (UEM, Consorcio MAVIR)
* Joaquin Rieta (TICSinergies)
* Federico Rodriguez (Stratebi Business Solutions)
* Miguel Angel Rodriguez (Telefonica I+D)
* Antonio Sanchez Valderrabanos (Bitext)
* Estela Saquete (UA, Consorcio MAVIR)
* Isabel Segura (UC3M, Consorcio MAVIR)
* Jim Shur (Strands)
* Jose Luis Suarez (Corex)
* Yaiza Temprado (Telefonica I+D)
* Marc Torrens (Strands)
* Paulo Villegas (Telefonica I+D)
* Pedro Vivancos (Vocali)
* David Zaragoza (Avanzis)

[CFP] SMUC 2010 Workshop @ CIKM 2010

2010-05-28T17:01:00.001+02:00

----------------------------------------------------------------------------------
CALL FOR PAPERS

SMUC2010: 2nd International Workshop on Search and Mining User-generated Contents
http://labs.brainsins.com/events/smuc2010
Workshop at CIKM 2010 (http://www.yorku.ca/cikm10/)
October 30, 2010, Toronto, Canada
Submission deadline: 30 June, 2010
-----------------------------------------------------------------------------------

SMUC10 aims to become a forum for researchers from several Information and Knowledge Management areas (data/text mining, information retrieval, semantics, etc.) who apply their work to the fields of Social Media and Opinion/Sentiment Analysis, where the main goal is to process user generated contents.

User generated content provides an excellent scenario to apply the metaphor of mining any kind of information. In a social media context, users create a huge amount of data where we can look for valuable nuggets of knowledge by applying several search techniques (information retrieval) or mining techniques (data mining, text mining, web mining, opinion mining, etc.). In this kind of data we can find both structured information (ratings, tags, links, etc.) and unstructured information (text, audio, video, etc.), and we must learn to combine existing techniques in order to take advantage of this heterogeneity while extracting useful knowledge.

TOPICS OF INTEREST
================

The SMUC10 workshop is an excellent place to present ongoing work that exploits Social Media and/or uses Opinion Mining technologies. Topics of interest include, but are not limited to:

A) Mining Social Media

Social networks analysis/mining
Tagging analysis/mining
Link and graphs analysis/mining
Community detection and evolution
Influence, trust and privacy analysis
Topic detection and trend discovery
Spamming and phishing detection
Wikipedia/Social Media vandalism
Social media monitoring/analysis

B) Opinion Mining and Sentiment Analysis

Opinion extraction, classification, summarization and visualization
Blogs analysis
Opinion flame
Temporal sentiment analysis
Cross-lingual/cross-domain sentiment analysis
Irony detection in opinion mining
Wish analysis
Product review analysis

C) Search in Social Media

Novel social search algorithms
Social ranking
Multi-entity/Multifaceted search
Multilingual and/or multimedia IR for Social Media
User Modeling and Personalization in Social Media
Architectures, scalability and efficiency

D) Other Social Intelligent Systems

Recommender systems
Semantic Social Media
Plagiarism detection
Market analysis
Cross-lingual/cross-domain social intelligent systems
Business Intelligence Applications (direct marketing, branding, etc.)

DATES
=================

* Submission: 30 June, 2010
* Notification of acceptance: 30 July, 2010
* Camera Ready: 15 August, 2010
* Workshop: 30 October, 2010

SUBMISSIONS
==================

Contributions should not exceed 8 pages in length, and must be prepared following the ACM camera-ready template: http://www.acm.org/sigs/pubs/proceed/template.html

All papers must be submitted in Adobe Portable Document Format (PDF). Please ensure that any special fonts used are included in the submitted documents. Please use the following link to submit your paper: Easychair Submission System for SMUC 2010 http://www.easychair.org/conferences/?conf=smuc2010.

The workshop proceedings will be published as Eproceedings by the same publisher that publishes the CIKM main conference proceedings, and will be on the same CD that contains the CIKM'10 main conference Eproceedings.

We are currently in talks with a couple of journals to organize a special issue with extended versions of selected papers. Information about this special issue will be published on the workshop's website.

ORGANIZING COMMITTEE
==================

* Jose C. Cortizo, BrainSins / European University of Madrid, Spain (contact person, josecarlos.cortizo@wipley.com)
* Francisco M. Carrero, BrainSins / European University of Madrid, Spain
* Ivan Cantador, Autonomous University of Madrid, Spain
* Jose A. Troyano, University of Seville, Spain
* Paolo Rosso, Technical University of Valencia, Spain

PROGRAM COMMITTEE
==================

[To be completed]

* Ahmed Abbasi (University of Wisconsin-Milwaukee, USA)
* Nitin Agarwal (University of Arkansas at Little Rock, USA)
* Enrique Amigo (National University of Distance Education, Spain)
* Ching-Man Au Yeung (NTT Communication Science Laboratories, Japan)
* Alexandra Balahur (Sapienza Università di Roma, Italy)
* Alberto Barrón (Technical University of Valencia, Spain)
* Dominik Benz (University of Kassel, Germany)
* Pushpak Bhattacharyya (IIT Bombay, India)
* Erik Cambria (University of Stirling, Scotland)
* Pablo Castells (Autonomous University of Madrid, Spain)
* Meeyoung Cha (Korea Advanced Institute of Science and Technology, Korea)
* Fermin Cruz (University of Seville, Spain)
* Victor Diaz (University of Seville, Spain)
* Viet Ha-Thuc (University of Iowa, USA)
* Akshay Java (MSN Microsoft, USA)
* Ralf Klamma (RWTH Aachen University, Germany)
* Zornitsa Kozareva (Information Sciences Institute at the University of Southern California, USA)
* Hady W. Lauw (Institute for Infocomm Research, Singapore)
* Luis Martin (BrainSins, Spain)
* Patricio Martinez-Barco (University of Alicante, Spain)
* Andres Montoyo (University of Alicante, Spain)
* Claudiu C. Musat (Politechnical University of Bucharest, Romania)
* Manuel Palomar (University of Alicante, Spain)
* Manos Papagelis (University of Toronto, Canada)
* Victor Peinado (National University of Distance Education, Spain)
* Isabella Peters (University of Duesseldorf, Germany)
* Martin Potthast (Bauhaus-Universität Weimar, Germany)
* Antonio Reyes (Technical University of Madrid, Spain)
* Horacio Rodriguez (Technical University of Catalonia, Spain)
* Efstathios Stamatatos (University of Aegean, Greece)
* Benno Stein (Bauhaus-Universität Weimar, Germany)
* Ralf Steinberger (European Commission, Joint Research Centre)
* Markus Strohmaier (Graz University of Technology, Austria)
* Jie Tang (Tsinghua University, China)
* Jordi Turmo (Technical University of Catalonia, Spain)
* Luis Alfonso Ureña (University of Jaen, Spain)

The Social Media Dataset

2010-04-18T15:50:00.003+02:00

Next Thursday I'll be at UPV (Universidad Politécnica de Valencia) giving a talk to the students of "Computational Linguistics Applications" about how to use Social Media as a "dataset" in the fields related to NLP/IR/DM.


SGP's 1st Funding Round

2010-04-15T15:23:00.003+02:00

We are now finishing our first funding series for SGP (Social Gaming Platform), a spin-off from MAVIR.

Our main activity is the development of technologies and services on the Internet, focusing on Social Media and E-Commerce. We have developed a Recommender System that can be integrated within any website via a REST API, and we have also developed Wipley, a social network for videogamers in which we have integrated our recommender system.

SGP was founded by 3 engineers (Francisco Carrero, Borja Monsalve and myself) with extensive experience in founding start-ups and in R&D projects. We have been part of 15 R&D projects, funded by entities such as the European Commission and the Spanish R&D Plan.

We are ready to commercialize our products, and that's why we have been looking for funding over the last few months. We have estimated a funding need of €550K for the first two years. So far we have been granted a CDTI PID project with a total amount of €350K and we have obtained another €100K from FFF, so we only need an extra €100K.

If you are interested in investing in our company, please contact us at the following e-mail: francisco.carrero(a)wipley.com. We'll give you all the information you need in order to evaluate the investment.

Workshop NLP in the Enterprise: Envisioning the Next 10 Years

2010-04-10T13:47:00.000+02:00

The workshop "NLP in the Enterprise: Envisioning the next 10 years" (PLN-E) will be held in Valencia (Spain), as a satellite workshop of SEPLN 2010, on September 6-7, 2010.

PLN-E aims to become a meeting point for academic researchers and companies interested in Natural Language Processing and related technologies like text mining, information retrieval, opinion mining, etc.

Topics of Interest

We invite submissions of work that shows how NLP can help enterprises solve problems or develop new products and services. We expect 3 different types of work:

(Academic/enterprise) Research work at an early stage but with a clear orientation to enterprise applications (products/services)
Work showing how one or several NLP technologies have been successfully used to solve a specific problem (such as developing a new product/service)
Work from companies explaining specific problems that cannot be solved with the current state of the art in NLP.

Topics of interest include, but are not limited to, the following:

A) New NLP Technologies with a clear Application in the Enterprise

Opinion Mining and Sentiment Analysis
Monolingual and Multilingual Information Systems
Voice Question-Answering Systems
Social Intelligent Systems
Plagiarism Detection and Systems for Detecting Confidential Information
Recommender Systems
Illicit contents/Crimes Detection Systems

B) New Application Domains

Social Media
Mobile Devices
Video Game Consoles and Digital Terrestrial Television (TDT)
Virtual Worlds and Massive Multiplayer Games

C) Business Aspects of NLP

New Business Models
Unsolved Problems
Scalability and Performance of NLP Technologies
Production Environments for NLP Systems
Real-World Datasets Descriptions
Demands and Market Needs

Dates

Submission: 15 June, 2010
Notification of acceptance: 15 July, 2010
Camera Ready: 15 August, 2010
Workshop: 6-7 September, 2010

CFPs on Recommender Systems

2010-03-25T11:37:00.004+01:00

There are several calls for papers for conferences and journals related to Recommender Systems. Conferences:

IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology. Deadline: 26 March 2010.
WI–IAT 2010 Workshop on Web Personalization and Recommender Systems (WebPRES 2010). Deadline: 16 April 2010.
Trust and Recommender Systems for Social Search and Web-Log Analysis. Deadline: 16 April.
ACM Recommender Systems. Deadline: 16 April (abstracts), 23 April 2010 (papers).

Journals/Special Issues:

Special Issue on: "Recommender Systems to Support the Dynamics of Virtual Learning Communities" on International Journal of Web Based Communities. Deadline: 30 March 2010.
Special Issue on "Social Linking and Hypermedia", on New Review of Hypermedia and Multimedia. Deadline: 11 June 2010.
Special Issue on "Social Recommender Systems", on ACM Transactions on Intelligent Systems and Technology. Deadline: 16 July 2010.
Special Issue on "Adaptation and Personalization for Ubiquitous Computing", on Personal and Ubiquitous Computing (Springer). Deadline: 30 November 2010.
Social Network Analysis and Mining, Springer.

The list of ACM RecSys workshops was published a few days ago. There are 7 different workshops related to Recommender Systems, so we'll have more CFPs soon.

CFP: APRESW Extended to March 15

2010-03-09T00:09:00.001+01:00

APRESW 2010 (1st International Workshop on Adaptation, Personalization and REcommendation in the Social-semantic Web), co-located with ESWC 2010, has extended its deadline to March 15.

RecSys + ECML/PKDD = Barcelona, September

2010-03-06T11:35:00.002+01:00

This year's editions of RecSys (ACM Recommender Systems) and ECML/PKDD (European Conference on Machine Learning) will be held in Barcelona. ECML will run from the 20th to the 24th of September (Monday to Friday), and RecSys will start on Sunday the 26th and finish on Thursday the 30th of September.

There is a clear topical relation between RecSys and ECML; in fact, most current RecSys approaches were first proven in other fields (like data mining, machine learning, information retrieval, etc.). For those who, like me, work across the two fields, it is a great opportunity to attend two great conferences in a single trip, and to discover Barcelona, which is a really great city.

Opinion Mining/Sentiment Analysis resources

2010-03-02T15:37:00.004+01:00

We're currently trying to integrate Opinion Mining techniques into some of our future products, so I've been reading a lot about Opinion Mining. I found the survey "Opinion Mining and Sentiment Analysis" by Bo Pang and Lillian Lee a really valuable resource, as it contains an extensive bibliography, publicly available datasets, lexical resources, etc.

Another interesting source has been the WOMSA workshop (1st Workshop on Opinion Mining and Sentiment Analysis), held in November 2009, with some up-to-date applications.

I also found the Evri API interesting, as it offers some sentiment-related details about the entities indexed in Evri.

I'm convinced we'll see a lot of Opinion Mining powered applications and services over the next few years. It's a really interesting field in terms of its applications in marketing, social media...

1st International Workshop on Adaptation, Personalization and REcommendation in the Social-semantic Web (APRESW 2010)

2010-01-08T14:27:00.001+01:00

Call for Papers: 1st International Workshop on Adaptation, Personalization and REcommendation in the Social-semantic Web (APRESW 2010)
30 or 31 May 2010 | Heraklion, Greece
http://nets.ii.uam.es/apresw2010/

In conjunction with the
7th Extended Semantic Web Conference (ESWC 2010)
http://www.eswc2010.org/

+++++++++++++++
Important dates
+++++++++++++++

* Paper submission: 7 March 2010
* Notification of paper acceptance/rejection: 5 April 2010
* Camera-ready copies of accepted papers: 18 April, 2010
* APRESW 2010 Workshop: 30 or 31 May 2010

++++++++++
Motivation
++++++++++

Over the last few years, researchers and practitioners of the Semantic Web have progressively consolidated a number of very important achievements. Formal languages have been standardized to define ontology-based knowledge representations, logic formalisms and query models. Ontology engineering methodologies and tools have been proposed to ease the designing and populating of ontological knowledge bases. Reasoning engines have been implemented to exploit inference capabilities of ontologies, and semantic-based frameworks have been built to enrich the functionalities of Web services. These achievements are the pillars to deal with the complex challenge of bringing semantics to the Web.

The above gives a new ground to extend the focus of the Semantic Web by engaging it in other communities, where semantics can play an important role. The available semantic knowledge bases can be used to enrich and link additional repositories, ontology engineering techniques can be utilized to properly design and build ontologies in further real-world domains, and inference and query mechanisms can enhance classic information management and retrieval approaches.

Among these communities, this workshop aims to attract the attention of students and professionals from both academia and industry who benefit from semantic-based techniques and technologies in within-application Adaptation, Personalization and Recommendation approaches. In parallel to the progress made in the Semantic Web research topics, works have been appearing in the above areas that use ontologies to model the user’s preferences, tastes and interests, and exploit these personal features together with meta-information about multimedia contents in order to provide the user with adaptation and personalization capabilities for different purposes such as information retrieval and item recommendation.

Moreover, with the advent of the Web 2.0 (also called the Social Web), the potential study and development of those approaches have increased exponentially. Social networks allow people to provide explicit relationships with others, and find out implicit user similarities based on their profiles. Social tagging services offer the opportunity to easily create and exploit personal knowledge representations. Wiki-style sites represent an environment where the community contributes and shares information, and blogs are media in which users express subjective opinions.

In all of these scenarios, adaptation, personalization and recommendation are core functionalities. However, the understanding and exploitation of the semantics underlying user and item profiles are still open issues.

++++++++++++++++++
Topics of interest
++++++++++++++++++

The workshop will focus on establishing user/usage models for adaptation, personalization and recommendation approaches for the Social-semantic Web.

Topics of interest include, but are not limited to the exploitation of the Web of Data, the identification of semantics underlying social annotations of multimedia contents, and the application of semantic-based techniques and technologies in research fields related to:

* Personalized access to multimedia content
* Content-based recommendation and collaborative filtering
* Adaptive exploration of multimedia content
* Adaptive user interfaces for multimedia content browsing and searching
* Community extraction and exploitation
* Social networks analysis for collaborative recommendation
* User profile construction based on social tagging information
* Context-aware multimedia content access and delivery
* Mobile and ubiquitous multimedia content access and delivery

++++++++++++++++++++
Organizing Committee
++++++++++++++++++++

* Iván Cantador, Universidad Autónoma de Madrid, Spain
* Peter Mika, Yahoo! Research, Spain
* David Vallet, Universidad Autónoma de Madrid, Spain
* José C. Cortizo, Universidad Europea de Madrid, Spain
* Francisco M. Carrero, Universidad Europea de Madrid, Spain

+++++++++++++++++
Program Committee
+++++++++++++++++

* Sofia Angeletou, Knowledge Media Institute, The Open University, UK
* Ching-man Au Yeung, NTT Communication Science Labs, Japan
* Pablo Castells, Universidad Autónoma de Madrid, Spain
* Manuel Cebrián, Massachusetts Institute of Technology, USA
* Rosta Farzan, Carnegie Mellon University, USA
* Miriam Fernández, Knowledge Media Institute, The Open University, UK
* Enrique Frías, Telefónica I+D, Spain
* Ana García-Serrano, Universidad Nacional de Educación a Distancia, Spain
* Andrés García-Silva, Universidad Politécnica de Madrid, Spain
* Tom Heath, Talis, UK
* Frank Hopfgartner, University of Glasgow, UK
* Ioannis Konstas, University of Edinburgh, UK
* Estefanía Martín, Universidad Rey Juan Carlos, Spain
* Phivos Mylonas, National Technical University of Athens, Greece
* Daniel Olmedilla, Telefónica I+D, Spain
* Carlos Pedrinaci, Knowledge Media Institute, The Open University, UK
* Jérôme Picault, Alcatel-Lucent Bell Labs, France
* Francesco Ricci, Free University of Bozen-Bolzano, Italy
* Sergey A. Sosnovsky, University of Pittsburgh, USA
* Martin Szomszor, City University London, UK
* Marc Torrens, Strands, Spain
* Paulo Villegas, Telefónica I+D, Spain

++++++++++
Organizers
++++++++++

* Universidad Autónoma de Madrid, http://www.uam.es/
* Yahoo! Research, http://research.yahoo.com/
* Universidad Europea de Madrid, http://www.esp.uem.es/gsi/

++++++++
Sponsors
++++++++

* Ministerio de Ciencia e Innovación de España (CENIT-2007-1012), https://i3media.barcelonamedia.org/
* Consorcio MAVIR, http://www.mavir.net
* Sistema Madri+d, http://www.madrimasd.org

+++++++++++++++++++
Contact information
+++++++++++++++++++

Dr. Iván Cantador
Departamento de Ingeniería Informática
Escuela Politécnica Superior
Universidad Autónoma de Madrid, Spain
E-mail: ivan.cantador@uam.es
Phone: +34 91 497 2358

[REMINDER] Special Issue of International Journal of Electronic Commerce on Mining Social Media

2009-12-23T19:19:00.002+01:00

The deadline for the special issue on Mining Social Media of the International Journal of Electronic Commerce is approaching (the deadline for abstracts is January 15). If you are interested in publishing in this special issue, please keep the dates in mind.

List of Social Tagging Datasets

2009-12-06T10:47:00.002+01:00

Markus Strohmaier is compiling a list of social tagging datasets available for research. Currently the list contains 8 datasets, but it's being updated according to the comments made on Markus' blog. It seems a good place to find interesting datasets to work on, and also to share the datasets we're currently working on.

From Search to Recommender Systems

2009-12-03T11:10:00.002+01:00

Tech companies rule the Web, and you can see that by looking at some of the biggest Internet companies, like Google and Amazon. Google is the reference company for Intelligent Systems because of its search engine, but also because of the quantity and quality of the other technologies it develops, like automated translation, user profiling, context management or even image processing.

Amazon is an e-commerce store, and it may not seem as closely tied to technology as Google, but the vision of Jeff Bezos and the company's commitment to technology have allowed it to grow like no one before in the e-commerce market. For Bezos, an online store should not limit its catalog to a few items: online stores should contain millions of products and should personalize the experience of their users. His vision was clear: "if I have 3 million customers on the Web, I should have 3 million online stores", and so the recommender system came to rule Amazon.

Both technologies (search and recommender systems) help with the Information Overload problem we suffer nowadays, but they're radically different in conception. Search engines need users to express their needs in textual form; they then process that query and retrieve the most relevant documents for it. Recommender Systems analyze the behaviour (and other kinds of data) of the user on a website and are then able to choose the products or contents most likely to interest the user. Both approaches are useful, but so far recommender systems are not as popular as search engines.
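
To make the contrast concrete, here is a minimal sketch in Python. It is only a toy illustration under my own assumptions, not how Google or Amazon actually work: the function names `search` and `recommend`, the sample documents and the browsing histories are all hypothetical. Search scores documents by term overlap with an explicit query, while the recommender only looks at which items tend to be viewed together.

```python
from collections import Counter, defaultdict
from itertools import combinations


def search(query, documents):
    """Explicit need: rank documents by how many query terms they contain."""
    terms = set(query.lower().split())
    scores = {}
    for doc_id, text in documents.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap:
            scores[doc_id] = overlap
    # Highest overlap first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


def recommend(user, histories, top_n=3):
    """Implicit need: suggest items that co-occur with what the user viewed."""
    # Count how often each pair of items appears in the same history.
    co_counts = defaultdict(Counter)
    for items in histories.values():
        for a, b in combinations(sorted(set(items)), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    seen = set(histories[user])
    scores = Counter()
    for item in seen:
        for other, count in co_counts[item].items():
            if other not in seen:
                scores[other] += count
    # Highest co-occurrence first; ties broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked][:top_n]


# Hypothetical toy data, invented for this example.
documents = {
    "d1": "opinion mining survey",
    "d2": "recommender systems for e-commerce",
    "d3": "mining social media opinion",
}
histories = {
    "ana": ["book", "lamp"],
    "ben": ["book", "lamp", "desk"],
    "eva": ["book", "desk", "chair"],
    "you": ["book"],
}

print(search("opinion mining", documents))  # the user typed a query
print(recommend("you", histories))          # the user typed nothing
```

Note that `recommend` never sees a query: everything it knows comes from behaviour, which is exactly the difference in conception between the two approaches.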

But we are at a turning point on the Web, as the way information is generated and consumed has changed. Nowadays, due to the success of Social Media sites like Facebook or Twitter, and even of earlier technologies such as RSS, we receive a lot of information passively: we don't directly ask for that information, but we receive it anyway. Until now, the problem was finding important information; now the problem is becoming choosing which of the information I already receive is relevant to me. That's why I think recommender systems will overtake search engines in popularity in the future.

1st Spring School on Social Media Retrieval (S3MR)

2009-11-08T23:03:00.001+01:00

DEADLINE: November 17, 2009.

Multimedia content has become ubiquitous on the web, creating new challenges for indexing, access, search and retrieval. At the same time, much of this content is made available on content sharing websites like YouTube or Flickr, or shared on social networks like Facebook. In such environments, the content is usually accompanied with metadata, tags, ratings, comments, information about the uploader and their social network, etc.

Analysis of these "social media" shows great potential for improving the performance of traditional multimedia information analysis/retrieval approaches by bridging the semantic gap between "objective" multimedia content analysis and "subjective" user needs and impressions. The integration of these aspects, however, is non-trivial and has created a vibrant, interdisciplinary field of research.

The Spring School on Social Media Retrieval aims at bringing together young researchers from neighboring disciplines, offering

(1) lectures delivered by experts from academia and industry providing a clear and in-depth summary of state-of-the-art research in social media retrieval,

(2) collaborative projects in small groups providing hands-on experience on integrative work on selected problems from the field.

Scope

* Content distribution over social/peer-to-peer networks
* Multimedia content analysis
* Automatic multimedia annotation/tagging
* Multimedia indexing/search/retrieval
* Implicit media tagging
* Social data analysis
* Collaborative tagging

Confirmed lecturers:

- Susanne Boll, Carl von Ossietzky Universität, Oldenburg, Germany, http://medien.informatik.uni-oldenburg.de/personen/susanne_boll/
- Roelof van Zwol, Yahoo Research, Barcelona, Spain, http://research.yahoo.com/Roelof_van_Zwol
- Ciro Cattuto, ISI Foundation, Turin, Italy, http://isiosf.isi.it/~cattuto/

For more information and to register, please visit our webpage: http://www.petamedia.eu/s3mr/

CFP: Special Issue of International Journal of Electronic Commerce on Mining Social Media

2009-10-30T11:49:00.003+01:00

After the experience of organizing the 1st International Workshop on Mining Social Media (papers now online), we've been organizing a special issue of the IJEC (International Journal of Electronic Commerce) on Mining Social Media. We are now releasing the CFP, hoping to receive high-quality papers on Mining Social Media:

OVERVIEW

Recently, Forrester published a report, “The Future of the Social Web”, which sketched a timeline of the development of the Social Web, dividing its evolution into five eras. According to that report, in the first era the Social Web started to exploit the social relationships among users. Then, in the social functionality era, several websites started to add social functionalities to help users interact with their peers. We are now in the era of Social Colonization, where technologies like Facebook Connect or Google Friend Connect have standardized social functionalities across websites, and the vast majority of websites now include several of them. Soon these federated identities will empower people to enter the era of social context, with personalized and social content, and the development of tools for personalizing social content will drive the era of social commerce.

The primary goal of the proposed special issue of the International Journal of Electronic Commerce is to foster research in the interplay between Social Media, Data Mining and Electronic Commerce, trying to reflect current developments in technologies that fit the Social Context era.

SCOPE

The International Journal of Electronic Commerce is the #1-ranked journal on Electronic Commerce globally. This Special Issue will provide a significant opportunity for authors to publish important novel and original contributions in the area of Data Mining applied to Social Media. The guest editors seek papers and proposals that address various aspects of Mining Social Media, including recommender systems for social media, data mining algorithms designed to exploit Social Networks, information management for Social Networks, etc.

RESEARCH QUESTIONS

We invite scholars and professionals from a broad range of disciplines to submit to this Special Issue. Papers may encompass any or all of the following: foundational theoretical analyses, modelling, simulation, and empirical studies. Authors may examine different aspects of mining social media in any of a variety of possible contexts. Special topics of interest include, but are not limited to, the following:

A. Data Mining for Social Networks

• Novel Algorithms
• Association Rules
• Mining semi-structured data
• Classification and Ranking
• Clustering
• Text Mining
• Machine Learning
• Privacy-Preserving Data Mining
• Statistical Methods
• Temporal and spatial data mining
• Parallel and Distributed Data Mining
• Interactive and Online Mining
• Data and Knowledge Visualization
• Multimedia mining (audio/video)
• Ensemble Methods
• Web Mining
• Graph Mining
• Link Mining

B. Information Management for Social Networks

• Recommender Systems
• Information Retrieval
• Sentiment Analysis
• Natural Language Processing
• Question Answering
• Semantic Processing
• Graph Analysis and Complex Networks
• Social Network Analysis

C. Possible applications

• Electronic Commerce
• E-Mail Spam Detection
• Blog/Social Networks Spam Detection
• Community Detection
• Users/content recommenders
• Trends discovery
• Blogs/Social Networks Community Dynamics
• User Reviews Ranking
• Blogs/Social Networks Contributions Summarization
• Abuse/Fraud Detection
• User Profile Modelling
• Event Detection and Tracking in Social Media
• Online Advertising

SUBMISSION GUIDELINES

Manuscripts submitted to the special issue should contain original material neither published in nor submitted to other journals. Each manuscript must have a cover page with the author information and another page with the title and abstract but without author information. The review process is double-blind, and papers that do not meet publication quality standards will be rejected before the review process.

Interested authors are required to submit extended abstracts of no more than two pages for their planned submissions. This will give the editorial team an opportunity to determine if a given submission is appropriate for expedited handling and review.

Full papers should be sent via e-mail to Jose Carlos Cortizo <josecarlos.cortizo@wipley.com> in anonymized PDF Format, not including any author names or affiliations, and should not exceed 40 pages.

IMPORTANT DATES

Abstracts Deadline: 15 January 2010
Abstracts Feedback: 30 January 2010
Full Paper Submission: 15 April 2010
Revision Notification: 1 June 2010
Revised Manuscripts: 1 August 2010
Final Decision: 1 October 2010

1st International Workshop on Mining Social Media Programme

2009-10-12T22:37:00.001+02:00

While we are still working on the final proceedings to be published in Bubok, and on the post-workshop special issue of a journal to be announced soon, we have the final version of the programme of the Mining Social Media Workshop. If you are interested in Mining Social Media, this will be a very good place to meet other researchers and practitioners. Registration is open.

9:30 - 11:00; Keynote speaker, William W. Cohen
11:00 - 11:30; Coffee Break
11:30 - 13:30; 6 paper presentations (20 minutes per paper)
- "Using prediction Markets and Twitter to predict a Swine Flu Pandemic", Joshua Ritterman, Miles Osborne and Ewan Klein
- "Comparison of Rule-based to Human Analysis of Chat Logs", April Kontostathis, Lynne Edwards, Jen Bayzick, India McGhee, Amanda Leatherman and Kristina Moore
- "Detecting Blogs Independently from the Language and Content", Francisco Manuel Rangel and Anselmo Peñas
- "Improve Web Search Ranking with Social Tagging", Shihn-Yuarn Chen and Yi Zhang
- "Combining Tag Cloud Learning with SVM Classification to Achieve Intelligent Search for Relevant Blog Articles", Ahmad Ammari and Valentina Zharkova
- "Folksonomy Analyzer: a FCA-based Tool for Conceptual Knowledge Discovery in Social Tagging Systems", Kyoung-Mo Yang, Suk-Hyung Hwang, Yu-Kyung Kang, Hae-Sool Yang
13:30 - 15:30; Lunch Break
15:30 - 17:00; 4 paper presentations (20 minutes per paper)
- "Fundamental operations for organizing resource groups in Grouped folksonomy", Yu-Kyung Kang, Suk-Hyung Hwang and Hae-Sool Yang
- "A Comparison of Approaches to Determine Topic Similarity of Weblogs for Privacy Protection", Dong Yi Wu and Amanda Stent
- "Data-Driven Ontologies for Recommender Engines in Social Networks", Ingo Bax and János Moldvay
- "Expert Stock Picker: The Wisdom of (the Experts in the) Crowds", Shawndra Hill, Noah Ready-Campbell
17:00 - 17:30; Coffee Break
17:30 - 19:30; Industry Panel with Tuenti, Strands and Optenet

Innovation in Search and Artificial Intelligence

2009-09-04T14:49:00.002+02:00

The first time I read the name Peter Norvig was when I bought the book "Artificial Intelligence: A Modern Approach" at 18, and for me he is one of the most brilliant researchers in AI. In this talk, Peter Norvig (also Research Director at Google) summarizes some of the latest advances in AI and Internet search, which allow us to develop new models to manage huge quantities of data.

Funded PhD position in Dynamic Network Analysis (Ireland)

2009-08-25T10:52:00.002+02:00

The Unit for Information Mining and Retrieval (http://uimr.deri.ie) invites applications for a funded PhD Studentship as part of the Clique Research Cluster at DERI. At the Clique Research Cluster (http://www.cliquecluster.org), we are investigating and analysing how very large real-life social networks, on-line forums, biological networks and other networks of interest evolve. Some areas we are interested in include:

Analysing how communities in these networks form and change with time;
Analysing how information and innovation diffuse, and formulating models to describe the observed diffusion behaviour;
Analysing churn in online communities and mobile call networks.

The Candidate

We are seeking applications for a PhD candidature in dynamic graph analysis. The successful candidate will analyse how particular properties of the networks change, and use network changes to detect abnormal events or to predict how information diffusion is enhanced or hampered. The successful candidate should have a bachelor's degree in computer science, maths, science or engineering, and have the pre-requisites for PhD studies at NUI Galway (http://www.nuigalway.ie).

The PhD studentship covers academic fees and includes a generous stipend for a four year period. In addition, desired, though not necessary, requirements are:

Familiarity with basic graph theory (e.g., finding connected components, shortest paths);
Familiarity with modelling and simulation;
Familiarity with social network analysis;
Familiarity with dynamic data analysis (e.g., data streaming algorithms, incremental algorithms);
Familiarity with text mining, feature extraction and machine learning;
Master's or equivalent degree in graph analysis, modelling or social network analysis.

The successful candidate will work with the PI Dr. Conor Hayes and Dr. Jeffrey Chan as part of the Clique Research Cluster at DERI, NUI Galway.

Application

Interested applicants should send an application with the subject header CLIQUE_PhD_09 to conor.hayes@deri.org. The application should contain a CV, a one-page statement explaining how the candidate's background is compatible with the aims of the Clique Research Cluster, and a list of references.

Via the Social Media Research mailing list

MSM09, Deadline Extended until September 6

2009-08-17T13:31:00.003+02:00

The submission deadline for the 1st International Workshop on Mining Social Media has been extended until the 6th of September. If you're working on any application to Social Media of data mining techniques, recommender systems, information retrieval or any other Information Access technique, this is a very good place to submit your work.

New Book: Modelling and Data Mining in Blogosphere

2009-07-31T10:26:00.002+02:00

A new Data Mining for Social Media book has been released. Authored by Nitin Agarwal (University of Arkansas at Little Rock) and Huan Liu (Arizona State University), "This book offers a comprehensive overview of the various concepts and research issues about blogs or weblogs. It introduces techniques and approaches, tools and applications, and evaluation methodologies with examples and case studies".

ISBN: 9781598299083 paperback
ISBN: 9781598299090 ebook

Online version available:

Table of Contents:

Chapter 1: Modeling Blogosphere
Chapter 2: Blog Clustering and Community Discovery
Chapter 3: Influence and Trust
Chapter 4: Spam Filtering in Blogosphere
Chapter 5: Data Collection and Evaluation
Appendix A: Tools in Blogosphere
Appendix B: API Examples

The Lemur Query Log Project

2009-07-29T13:12:00.002+02:00

Jose Maria Gomez has written in his blog about the Lemur Query Log Project, a very interesting initiative led by Dr. Bruce Croft. The Lemur Query Log Project features a toolbar that collects queries and related navigation from users and sends them to a database, building a massive query log that may benefit the IR research community.

Information Retrieval, like most of the subdisciplines related to intelligent information access, relies on the availability of data, and more specifically on test datasets. That's the reason why projects like the Lemur Query Log are so important for future research and development.

Frauds in Science

2009-06-13T11:21:00.004+02:00

A month ago I wrote a post in my Spanish Intelligent Systems blog about frauds in science. In this post I summarize what I wrote, because I think it can be a good starting point for a debate about the present and future of Science.

Talking about frauds in science could take quite long: there are many little things in the current scientific process that should be corrected (fake conferences, strange publishing processes, etc.), but I'll focus on big frauds.

In August 2005, PLoS Medicine published "Why Most Published Research Findings Are False", dealing with bad experimental design that leads to wrong research findings. As stated in this paper, the scientific process is heavily focused on novel research, and there is almost no support for research that tries to replicate previous results and corroborate previous findings.

In the paper "Repairing research integrity", published in Nature in June 2008, Sandra Titus and her team analyze integrity in scientific studies. Based on a survey of more than 4,000 researchers from over 600 institutions, the results showed more than 200 cases of misconduct in scientific studies, a number much higher than that previously registered by the ORI (Office of Research Integrity). More than 60% of the cases involved data falsification, with plagiarism the next most common kind of misconduct detected. Some of these frauds are detected in time, as in the Kristin Roovers case, which was discovered by the editors of The Journal of Clinical Investigation when a submitted paper contained images that had been manipulated with Photoshop.

There are some regions where fraud is an even bigger problem. This is the case in China, where more than 60% of PhD students admit they have plagiarized some work. This represents a really big problem for China's research and even for the whole scientific community.

Recently, another big fraud in science was discovered when The Scientist reported that Elsevier, one of the biggest scientific publishers, had several agreements with companies to publish scientific journals that those companies used to promote their products. The first case detected was the Australasian Journal of Bone and Joint Medicine, where a paper was published promoting a product from Merck, a company that paid Elsevier to produce the journal. Summer Johnson writes about this big fraud in Bioethics, which is highly recommended reading.

The scientific community must react to all these things if we want to preserve the image of science, but what can we do? I think there are several options that could improve the scientific process:

1.- Open Access. The Elsevier case should make us think that letting companies like Elsevier control scientific publishing is not a good idea. Open Access seems a good way to protect science from the desires and interests of big publishing companies. It is also a good way to ensure egalitarian access to scientific results.

2.- We should try to support initiatives devoted to negative results (or less important ones). Journals like Journal of Interesting Negative Results in Natural Language Processing and Machine Learning, Journal of Negative Results on Biomedicine, or Journal of Negative Results, are doing a good job publishing that kind of result.

3.- It also seems very important to improve the working conditions of researchers. For instance, in Spain a lot of researchers earn less money than they would working in a supermarket or driving a taxi, occupations with fewer responsibilities and less impact on society. Who can care about doing high-quality research if he or she can't give his or her family a decent living?

4.- We also need to take up scientific ethics again. As researchers we must value what the word science means. Science is not about publishing papers; science is all about improving global knowledge. Science is something really great.

Wikipedia Page Traffic Statistics for DataMiners

2009-06-13T10:43:00.002+02:00

Gregory Piatetsky pointed out, on the KDnuggets Twitter account, the release of a data package containing 7 months of hourly pageview statistics for all articles in Wikipedia. The dataset is over 320 GB compressed (over 1 TB uncompressed) and covers more than 2.5 million Wikipedia articles. All text content, statistics and link data in the dataset are licensed under the GFDL (GNU Free Documentation License).