<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Sonamine's Blog</title>
	<atom:link href="http://sonamine.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://sonamine.wordpress.com</link>
	<description>Commercial applications of graph theory</description>
	<lastBuildDate>Thu, 05 Nov 2009 21:41:27 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='sonamine.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Sonamine's Blog</title>
		<link>http://sonamine.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://sonamine.wordpress.com/osd.xml" title="Sonamine&#039;s Blog" />
	<atom:link rel='hub' href='http://sonamine.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Using network analysis to predict private company revenues</title>
		<link>http://sonamine.wordpress.com/2009/11/05/using-network-analysis-to-predict-private-company-revenues/</link>
		<comments>http://sonamine.wordpress.com/2009/11/05/using-network-analysis-to-predict-private-company-revenues/#comments</comments>
		<pubDate>Thu, 05 Nov 2009 21:41:27 +0000</pubDate>
		<dc:creator>sonamine</dc:creator>
				<category><![CDATA[Financial records]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[network structure]]></category>
		<category><![CDATA[predictive analysis]]></category>

		<guid isPermaLink="false">http://sonamine.wordpress.com/?p=244</guid>
		<description><![CDATA[Ever wonder how newspapers and analysts obtain information about private companies?  It&#8217;s still a mystery to me but these researchers have figured out a way to triangulate this information from news reports about public companies!  The basic intuition is that using the wisdom of the crowds, in this case reporters and journalists, news stories tend [...]]]></description>
			<content:encoded><![CDATA[<p>Ever wonder how newspapers and analysts obtain information about private companies?  It&#8217;s still a mystery to me but these researchers have figured out a way to triangulate this information from news reports about public companies!  The basic intuition is that using the wisdom of the crowds, in this case reporters and journalists, news stories tend to connect different companies together.  Perhaps these news stories could become good predictive variables for revenue estimation.  Here&#8217;s what they did.</p>
<p>They first constructed a social network of both public and private companies based on news stories.  Companies that were mentioned in the same story were connected by a link.   Eight months of news stories on Yahoo Finance were used.  The more mentions of two particular companies together, the stronger the weight of the link between these two companies.   For two datasets, this method generated 2 lists of over 6000 companies each.</p>
<p>Then they used this social network to generate a set of scores for each company.  These scores included weighted in-degree, weighted out-degree, out-degree minus in-degree, PageRank, HITS and betweenness.</p>
<p>These social network scores, along with publicly available financial information about a portion of the companies, were fed into a data mining engine using logistic regression and decision trees.  The goal was to see if they could classify the CRR (the company revenue relation between any 2 companies).</p>
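<p>Here is a minimal sketch of the network-building step, assuming each news story has already been reduced to the set of companies it mentions.  The company names and function names are made up for illustration, and the links are treated as undirected for simplicity, so this is not the authors&#8217; exact procedure (their in-degree and out-degree scores imply directed links).</p>

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: each story is the set of companies it mentions.
stories = [
    {"Acme", "Globex"},
    {"Acme", "Globex", "Initech"},
    {"Initech", "Umbrella"},
]

def build_comention_network(stories):
    """Return {(a, b): weight}, where weight counts stories mentioning both."""
    weights = Counter()
    for companies in stories:
        # sorted() gives every pair one canonical (a, b) ordering
        for a, b in combinations(sorted(companies), 2):
            weights[(a, b)] += 1
    return weights

def weighted_degree(weights):
    """Sum of the link weights touching each company."""
    degree = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return degree

network = build_comention_network(stories)
degree = weighted_degree(network)
print(network[("Acme", "Globex")])  # 2: the pair shares two stories
print(degree["Acme"])               # 3: weight 2 to Globex plus 1 to Initech
```

<p>Scores like PageRank, HITS and betweenness would be computed on the same weighted network, typically with a graph library.</p>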
<p><strong>Results</strong></p>
<p>They evaluated the performance of the predictions using 3 metrics:</p>
<p>precision = number of correctly predicted positive (negative) instances / number of predicted positive (negative) instances;<br />
recall = number of correctly predicted positive (negative) instances / number of actual positive (negative) instances;<br />
accuracy = number of correctly predicted instances / number of instances</p>
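<p>To make the three definitions concrete, here is a small sketch that computes them for one class.  The labels below are made-up toy data, not results from the paper, and the function name is my own.</p>

```python
def evaluate(actual, predicted, positive=True):
    """Precision and recall for one class, plus overall accuracy."""
    pairs = list(zip(actual, predicted))
    correct_pos = sum(1 for a, p in pairs if a == positive and p == positive)
    predicted_pos = sum(1 for _, p in pairs if p == positive)
    actual_pos = sum(1 for a, _ in pairs if a == positive)
    correct = sum(1 for a, p in pairs if a == p)
    return {
        "precision": correct_pos / predicted_pos if predicted_pos else 0.0,
        "recall": correct_pos / actual_pos if actual_pos else 0.0,
        "accuracy": correct / len(pairs) if pairs else 0.0,
    }

# Toy labels for eight company pairs (hypothetical).
actual = [True, True, True, False, False, False, False, True]
predicted = [True, True, False, False, False, True, False, True]

scores = evaluate(actual, predicted)
print(scores)
```

<p>Passing positive=False gives the precision and recall for the negative class, which is what the &#8220;(negative)&#8221; in the definitions refers to.</p>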
<p>All three came out between 70% and 80%!</p>
<p>&#8220;The two dyad degree-based attributes, NWD and WDOD, fail to predict revenue relations well, whereas the four node degree-based and six node centrality-based attributes produce results nearly as good as those from using all 12 attributes together.&#8221;</p>
<p><strong>So what</strong></p>
<p>Many financial analysts who do not have access to the financials of private companies still need to perform valuation analysis.  This method of combining news and public company financial information provides another way to forecast the revenues of private companies.</p>
<p><strong>References</strong></p>
<p>Zhongming Ma, Olivia R.L. Sheng , Gautam Pant.  Discovering company revenue relations from news: A network approach.  Decision Support Systems.  2009 vol 47, p408-414.</p>
]]></content:encoded>
			<wfw:commentRss>http://sonamine.wordpress.com/2009/11/05/using-network-analysis-to-predict-private-company-revenues/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01593a36aa4a0ef58383fcdf289243eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Scalable graph analysis</media:title>
		</media:content>
	</item>
		<item>
		<title>Launched Sonamine Telco Churn Predictor</title>
		<link>http://sonamine.wordpress.com/2009/09/23/launched-sonamine-telco-churn-predictor/</link>
		<comments>http://sonamine.wordpress.com/2009/09/23/launched-sonamine-telco-churn-predictor/#comments</comments>
		<pubDate>Wed, 23 Sep 2009 07:08:06 +0000</pubDate>
		<dc:creator>sonamine</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[churn prediction]]></category>
		<category><![CDATA[mobile operator]]></category>
		<category><![CDATA[social network analytics]]></category>
		<category><![CDATA[telco]]></category>

		<guid isPermaLink="false">http://sonamine.wordpress.com/?p=242</guid>
		<description><![CDATA[Short post &#8211; we launched Sonamine Telco Churn Predictor at Prepaid Mobile 2009 show in Lisbon yesterday.  It was a great rush and already we have demoed it to many interested telcos.  B-eye-Network picked it up in their news and we expect more news to follow.  Stay tuned.  BTW, you can read more on our [...]]]></description>
			<content:encoded><![CDATA[<p>Short post &#8211; we launched the Sonamine Telco Churn Predictor at the Prepaid Mobile 2009 show in Lisbon yesterday.  It was a great rush, and we have already demoed it to many interested telcos.  B-eye-Network picked it up in their news and we expect more coverage to follow.  Stay tuned.  BTW, you can read more on our www.sonamine.com website.</p>
]]></content:encoded>
			<wfw:commentRss>http://sonamine.wordpress.com/2009/09/23/launched-sonamine-telco-churn-predictor/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01593a36aa4a0ef58383fcdf289243eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Scalable graph analysis</media:title>
		</media:content>
	</item>
		<item>
		<title>Information diffusion through social networks</title>
		<link>http://sonamine.wordpress.com/2009/08/18/information-diffusion-through-social-networks/</link>
		<comments>http://sonamine.wordpress.com/2009/08/18/information-diffusion-through-social-networks/#comments</comments>
		<pubDate>Tue, 18 Aug 2009 16:23:05 +0000</pubDate>
		<dc:creator>sonamine</dc:creator>
				<category><![CDATA[Marketing]]></category>
		<category><![CDATA[Social networks]]></category>
		<category><![CDATA[viral marketing]]></category>
		<category><![CDATA[cascades]]></category>
		<category><![CDATA[Information diffusion]]></category>

		<guid isPermaLink="false">http://sonamine.wordpress.com/?p=234</guid>
		<description><![CDATA[Back from relaunching http://www.sonamine.com&#8230;and free eval product. How do companies &#8220;get the word out&#8221; about a new product?  Used to be in the old days, you put out a press release and then you track the number of &#8220;press hits&#8221;.  Then you estimate the readership of these press hits to find overall coverage of the [...]]]></description>
			<content:encoded><![CDATA[<p>Back from relaunching <a href="http://www.sonamine.com" target="_blank">http://www.sonamine.com</a>&#8230;and free eval product.</p>
<p>How do companies &#8220;get the word out&#8221; about a new product?  In the old days, you put out a press release and then tracked the number of &#8220;press hits&#8221;.  Then you estimated the readership of those press hits to find the overall coverage of the press release.</p>
<p>The new Web 2.0 social world essentially provides a more advanced form of this press release&#8230;  Thanks to a flowering of social media consulting sites, every business is posting information on Facebook, Twitter etc&#8230;  Do you ever wonder how far this information spreads?  How can we compare the old PR methods with the new social PR methods?  After all, we are getting past the hype peak for social media&#8230;</p>
<p>This study looks at how pictures were &#8220;faved&#8221; in Flickr and how a user&#8217;s social friends subsequently faved them too.  Some data details first: the researchers took 100 days of Flickr snapshots covering about 25% of the total user base, which grew from 1.6M to 2.5M during that period.  There were 34.7M &#8220;fave&#8221; markings on 11.3M photos.  By timing when each photo was faved, the researchers could study how photos propagated through the network.</p>
<p>Questions being asked: how long does it take a photo to spread to the first user, and then to subsequent users?  What are some of the factors involved?  A social cascade step was defined as user A faving a photo that her friend B had already faved, with A having friended B before faving it.  Basically, A found (and faved) the photo by seeing her friend B&#8217;s fav photos.</p>
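<p>A rough sketch of how one could detect a cascade step from timestamps.  It assumes my reading of the definition: the friend faved the photo first, and the friendship already existed when the follower faved it.  The user names, times and function name are made up for illustration.</p>

```python
def cascade_steps(fave_time, friend_since):
    """Find (follower, source) pairs that count as one cascade step.

    fave_time:    {user: time the user faved the photo}
    friend_since: {(follower, source): time follower added source as a friend}
    """
    steps = []
    for (follower, source), linked_at in friend_since.items():
        if follower not in fave_time or source not in fave_time:
            continue
        follower_faved = fave_time[follower]
        # The source faved first, and the friendship already existed
        # when the follower faved.
        if follower_faved > fave_time[source] and follower_faved > linked_at:
            steps.append((follower, source))
    return steps

# Hypothetical data: times are days since the photo was uploaded.
fave_time = {"bob": 1, "alice": 3, "carol": 10}
friend_since = {
    ("alice", "bob"): 0,     # alice followed bob before faving: cascade step
    ("carol", "alice"): 12,  # friendship came after carol's fave: not a step
    ("bob", "carol"): 0,     # bob faved before carol did: not a step
}
print(cascade_steps(fave_time, friend_since))
```

<p>Only the alice/bob pair qualifies here.  The time between the two faves in each qualifying pair is the lag that the study measures.</p>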
<p><img class="aligncenter size-full wp-image-237" title="cascade1" src="http://sonamine.files.wordpress.com/2009/08/cascade1.png?w=500&#038;h=319" alt="cascade1" width="500" height="319" />50% of the first cascade steps occurred in less than 3 days, while 50% of the subsequent cascade steps occurred within 50 days, &#8220;an order of magnitude larger than the time before the first step of the cascade.&#8221;</p>
<p>Good so far, but what I found really intriguing was the application of epidemic theory here.  The reproductive rate is the number of subsequent infections that result from a single infected person.  It is sort of a multiplier effect.  Cha et al. tried to create a predictive function for this number, based on known properties of the social network.  In particular, they based the function on the degree (think number of friends) of each node and the % of friends that were infected.  The correlation between the predicted value and the empirical value is shown below.</p>
<p><img class="aligncenter size-full wp-image-238" title="cascade1" src="http://sonamine.files.wordpress.com/2009/08/cascade11.png?w=500&#038;h=348" alt="cascade1" width="500" height="348" />Basically, for small values, the predictive function is pretty good.  In the words of the authors, &#8220;Given the transmission probability of a picture derived from a short time series of user activity, we can then predict the expected spread of the photo for any network structure for which the node degree distribution k is known. This means that we can predict the reproduction number and the resulting spread of these 1,000 photos when they are adopted into other online social networks such as Facebook, Livejournal, and MySpace.&#8221;</p>
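<p>The post does not give the authors&#8217; exact function, so the sketch below is an assumption on my part.  It uses a textbook estimate from epidemiology for networks with uneven degrees: R0 = transmission probability * mean(k squared) / mean(k), where k is the number of friends of a node.  The degree list and the 0.05 transmission probability are made-up numbers.</p>

```python
def reproduction_number(transmission_prob, degrees):
    """Estimate R0 = rho0 * mean(k^2) / mean(k) from a list of node degrees."""
    mean_k = sum(degrees) / len(degrees)
    mean_k_squared = sum(k * k for k in degrees) / len(degrees)
    return transmission_prob * mean_k_squared / mean_k

# Hypothetical network: four users with few friends and one hub with ten.
degrees = [1, 1, 2, 2, 10]
r0 = reproduction_number(0.05, degrees)
print(round(r0, 3))
```

<p>Because the degrees are squared, a few very well-connected users push the estimate up much more than the average number of friends would suggest.</p>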
<p><strong>So what</strong></p>
<p>Direct online marketers can use this to understand how information spreads within their house email list.  It&#8217;s easy to run a series of seed campaigns, track the forward links, and use these techniques to document the &#8220;viral&#8221; nature of the list.  You can also use the same model to understand and track the &#8220;earned media&#8221; value of social networks.</p>
<p><strong>More work</strong></p>
<p>It would be really interesting next to see how the local neighborhood characteristics of the infectees correlate with the initial infection lag time&#8230;more to come.</p>
<p><strong>Reference</strong></p>
<p>Characterizing social cascades in Flickr.  Cha, M., Mislove, A., Adams, B., Gummadi, K.P.   WOSN’08, August 18, 2008, Seattle, Washington, USA</p>
]]></content:encoded>
			<wfw:commentRss>http://sonamine.wordpress.com/2009/08/18/information-diffusion-through-social-networks/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01593a36aa4a0ef58383fcdf289243eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Scalable graph analysis</media:title>
		</media:content>

		<media:content url="http://sonamine.files.wordpress.com/2009/08/cascade1.png" medium="image">
			<media:title type="html">cascade1</media:title>
		</media:content>

		<media:content url="http://sonamine.files.wordpress.com/2009/08/cascade11.png" medium="image">
			<media:title type="html">cascade1</media:title>
		</media:content>
	</item>
		<item>
		<title>Social network structure predicts churn in mobile telecommunications</title>
		<link>http://sonamine.wordpress.com/2009/07/15/social-network-structure-predicts-churn-in-mobile-telecommunications/</link>
		<comments>http://sonamine.wordpress.com/2009/07/15/social-network-structure-predicts-churn-in-mobile-telecommunications/#comments</comments>
		<pubDate>Wed, 15 Jul 2009 14:26:06 +0000</pubDate>
		<dc:creator>sonamine</dc:creator>
				<category><![CDATA[Marketing]]></category>
		<category><![CDATA[Social graph]]></category>
		<category><![CDATA[Social networks]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[viral marketing]]></category>
		<category><![CDATA[churn]]></category>
		<category><![CDATA[predictive analysis]]></category>
		<category><![CDATA[telco]]></category>
		<category><![CDATA[word of mouth]]></category>

		<guid isPermaLink="false">http://sonamine.wordpress.com/?p=217</guid>
		<description><![CDATA[When you change a mobile service plan or phone number, the first thing you&#8217;ll probably do is let your friends know, just in case they encounter problems reaching you at your new number or new service plan.  So it&#8217;s natural to think that this &#8220;churn&#8221; behavior would propagate through the social network. This study studies whether [...]]]></description>
			<content:encoded><![CDATA[<p>When you change a mobile service plan or phone number, the first thing you&#8217;ll probably do is let your friends know, just in case they encounter problems reaching you at your new number or new service plan.  So it&#8217;s natural to think that this &#8220;churn&#8221; behavior would propagate through the social network.</p>
<p>This study examines whether this &#8220;churn&#8221; spreads through a call graph network, and how you could use such a call graph to predict who will churn in subsequent months.  Some basics first.  The study data is a small subset from a large mobile carrier during March 2007; the final size after some pruning and cleaning was 2.1 million users and 9.3 million calls.  Many of these users are prepaid card users for whom the carrier has no demographic information.  The authors then studied how the probability of churning in April, May, June and July is related to each user&#8217;s local neighborhood of friends.</p>
<p>Result 1 &#8211; the probability of churning is closely related to how many of your neighbors have churned.  The diagram below shows that your likelihood of churning rises steadily with the number of churner neighbors until it reaches a plateau.</p>
<p><img class="aligncenter size-full wp-image-221" title="churn neighbors" src="http://sonamine.files.wordpress.com/2009/07/churn-neighbors1.jpg?w=500&#038;h=390" alt="churn neighbors" width="500" height="390" />Result 2 &#8211; the probability of churn is also related to the number of churned neighbors who are connected to each other.</p>
<p>The authors then proceed to study how well certain variables, including social network metrics, predict who will churn.  They used a standard decision tree classifier; this is a standard supervised machine learning (data mining) problem.  There were 3 types of input data:</p>
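<p>Result 1 boils down to a simple tabulation, sketched below under my own assumptions about the data layout: a call graph stored as a dictionary of neighbor sets, one set of users who churned earlier, and one set of users who churned in a later month.  All names and data are hypothetical.</p>

```python
from collections import defaultdict

def churn_rate_by_churner_neighbors(neighbors, churned_before, churned_later):
    """Map k (number of already-churned neighbors) to the later churn rate."""
    totals = defaultdict(int)
    churners = defaultdict(int)
    for user, friends in neighbors.items():
        if user in churned_before:
            continue  # only users who are still active can churn later
        k = sum(1 for f in friends if f in churned_before)
        totals[k] += 1
        if user in churned_later:
            churners[k] += 1
    return {k: churners[k] / totals[k] for k in sorted(totals)}

# Hypothetical call graph: x and y churned first.
neighbors = {
    "a": {"x", "y"}, "b": {"x"}, "c": {"x", "y"}, "d": set(), "e": {"y"},
    "x": {"a", "b", "c"}, "y": {"a", "c", "e"},
}
rates = churn_rate_by_churner_neighbors(
    neighbors, churned_before={"x", "y"}, churned_later={"a", "c", "e"}
)
print(rates)
```

<p>In this toy data the churn rate climbs from 0 for users with no churner neighbors to 1 for users with two, which is the kind of rising curve the diagram shows.</p>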
<ol>
<li>Usage information (called DT1) &#8211; call frequency, number of calls, number of friends, call volume, duration of calls etc</li>
<li>Connectivity (called DT2) &#8211; These are usage statistics but broken down by churner or non-churner neighbors.  Two examples are &#8220;number of churner neighbors&#8221; and &#8220;number of non-churner neighbors who have churners as neighbors&#8221;.  The second example is closely related to the concept of eigenvector centrality of the churner network.</li>
<li>Inter-connectivity (called DT3) &#8211; These are network or graph type metrics which can only be calculated using graph methods.  Examples are &#8220;number of adjacent pairs in the set of churner neighbors&#8221;, &#8220;number of pairs in the churner friends connected by path length of 2&#8221;, &#8220;number of pairs of churner friends whose shortest paths only include churner neighbors&#8221; and &#8220;total call volume on edges connecting adjacent churner friends&#8221;.</li>
</ol>
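<p>Two of the features listed above can be sketched in a few lines: &#8220;number of churner neighbors&#8221; (DT2) and &#8220;number of adjacent pairs in the set of churner neighbors&#8221; (DT3).  The call graph layout, the names and the data are my own assumptions for illustration, not the authors&#8217; implementation.</p>

```python
from itertools import combinations

def churner_neighbor_features(user, neighbors, churners):
    """Two illustrative features for one subscriber.

    neighbors: {user: set of users they exchanged calls with}
    churners:  set of users who have already churned
    """
    churner_friends = sorted(n for n in neighbors[user] if n in churners)
    # Connectivity-style feature: how many neighbors have churned
    churner_count = len(churner_friends)
    # Inter-connectivity-style feature: churner neighbors who also call each other
    adjacent_pairs = sum(
        1
        for a, b in combinations(churner_friends, 2)
        if b in neighbors.get(a, set())
    )
    return {
        "churner_neighbors": churner_count,
        "adjacent_churner_pairs": adjacent_pairs,
    }

# Hypothetical call graph around subscriber "u".
neighbors = {
    "u": {"a", "b", "c", "d"},
    "a": {"u", "b"},
    "b": {"u", "a", "c"},
    "c": {"u", "b"},
    "d": {"u"},
}
features = churner_neighbor_features("u", neighbors, churners={"a", "b", "c"})
print(features)
```

<p>Here u has three churner neighbors, and two pairs of them (a-b and b-c) are directly connected.  The first number needs only a lookup per neighbor, while the second requires walking the graph around each subscriber, which is why DT3 needs graph methods.</p>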
<p>Results &#8211; Much more accurate churn prediction using the social network metrics DT3 and DT2, compared to DT1 alone.  See graph below.</p>
<p><img class="aligncenter size-full wp-image-222" title="churn neighbors" src="http://sonamine.files.wordpress.com/2009/07/churn-neighbors2.jpg?w=500&#038;h=373" alt="churn neighbors" width="500" height="373" />Recall that a lift curve shows how many of the actual churners were predicted by the decision tree classifier as you walk through the subscribers in the dataset, ordered from most to least likely to churn according to the classifier.  When you walk through 100% of the subscribers, you will catch 100% of the churners.  But that is not very useful, since you would have to send marketing promotions or make calls to every subscriber.  On the other hand, if you can identify 50% of the churners after walking through 10% of the subscribers, that&#8217;s pretty good.</p>
<p>The graph shows that using usage data (DT1), after walking through 10% of the base, they could predict about 10% of the churners.  Not good.</p>
<p>Using DT2, after walking through 10% of the base, they could predict about 18% of the churners.  A good improvement, but still not good.</p>
<p>Using DT3, after walking through 10% of the base, they could predict about 40% of the churners.  This is very good.</p>
<p><strong>Nick&#8217;s back-of-the-envelope business case calculation on why this matters</strong></p>
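<p>One point on a lift curve can be computed as sketched below: rank the subscribers by predicted churn score, walk through the top fraction, and count how many of the real churners you caught.  The scores and churner set are toy data I made up, chosen so that the top 10% catches 40% of the churners.</p>

```python
def churners_captured(scores, churned, fraction):
    """Share of all churners found in the top `fraction` of subscribers by score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = round(len(ranked) * fraction)
    found = sum(1 for user in ranked[:cutoff] if user in churned)
    return found / len(churned)

# Hypothetical classifier scores for 20 subscribers, s0 being the riskiest.
scores = {f"s{i}": 1.0 - i / 20 for i in range(20)}
churned = {"s0", "s1", "s5", "s9", "s15"}

print(churners_captured(scores, churned, 0.10))  # top 2 subscribers
print(churners_captured(scores, churned, 1.00))  # the whole base
```

<p>Repeating this for every fraction from 0 to 1 traces out the full curve.</p>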
<table border="1" cellspacing="0" cellpadding="0" width="491">
<tbody>
<tr>
<td width="309" valign="bottom"></td>
<td width="91" valign="bottom">
<p align="center">DT1</p>
</td>
<td width="91" valign="bottom">
<p align="center">DT3</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">Total   subscriber count</td>
<td>
<p align="right">2,000,000</p>
</td>
<td>
<p align="right">2,000,000</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">Churn   rate</td>
<td>
<p align="right">0.06</p>
</td>
<td>
<p align="right">0.06</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">#   churners</td>
<td>
<p align="right">120,000</p>
</td>
<td>
<p align="right">120,000</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">Fraction of churners identified in 10% of subscribers</td>
<td>
<p align="right">0.1</p>
</td>
<td>
<p align="right">0.4</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">#   churners identified</td>
<td>
<p align="right">12,000</p>
</td>
<td>
<p align="right">48,000</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">Conversion   rate of discount offer</td>
<td>
<p align="right">0.1</p>
</td>
<td>
<p align="right">0.1</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">#   churners converted/saved</td>
<td>
<p align="right">1,200</p>
</td>
<td>
<p align="right">4,800</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">#   churners lost</td>
<td>
<p align="right">118,800</p>
</td>
<td>
<p align="right">115,200</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">Discount   offer Rate</td>
<td>
<p align="right">0.05</p>
</td>
<td>
<p align="right">0.05</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom"></td>
<td valign="bottom"></td>
<td valign="bottom"></td>
</tr>
<tr>
<td width="309" valign="bottom">Revenue   from saved churners calculated as (1-discount offer)*number of saved churners</td>
<td>
<p align="right"><strong> 1,140 </strong></p>
</td>
<td>
<p align="right"><strong> 4,560 </strong></p>
</td>
</tr>
<tr>
<td width="309" valign="bottom"></td>
<td valign="bottom"></td>
<td valign="bottom"></td>
</tr>
<tr>
<td width="309" valign="bottom">Number of non-churners in 10% of subscribers</td>
<td>
<p align="right">188,000</p>
</td>
<td>
<p align="right">152,000</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">Number of non-churners who took the discount</td>
<td>
<p align="right">18,800</p>
</td>
<td>
<p align="right">15,200</p>
</td>
</tr>
<tr>
<td width="309" valign="bottom">Lost   revenue from discount to non-churners</td>
<td>
<p align="right"><strong>-940</strong></p>
</td>
<td>
<p align="right"><strong>-760</strong></p>
</td>
</tr>
<tr>
<td width="309" valign="bottom"></td>
<td valign="bottom"></td>
<td valign="bottom"></td>
</tr>
<tr>
<td width="309" valign="bottom">Net return on churn marketing campaign (revenue from saved churners minus revenue lost to discounts)</td>
<td valign="bottom"><strong> 200 </strong></td>
<td valign="bottom"><strong> 3,800 </strong></td>
</tr>
</tbody>
</table>
<p>All revenue figures in the table are in units of one subscriber&#8217;s full-price revenue.  Using DT1, it is probably not worth the effort to organize a marketing campaign for a net gain of 200 units.  But with DT3, it becomes a profitable option to give a 5% discount offer to likely churners!</p>
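<p>The table can be reproduced with a few lines of arithmetic.  This sketch uses the same inputs as the table (2,000,000 subscribers, 6% churn, 10% of the base contacted, 10% offer conversion, 5% discount); the function and parameter names are my own, and revenue is in units of one subscriber&#8217;s full-price revenue.</p>

```python
def campaign_net_return(subscribers, churn_rate, capture_rate,
                        contacted_share=0.10, conversion_rate=0.10,
                        discount=0.05):
    """Net return of the campaign, in units of one subscriber's revenue."""
    churners = subscribers * churn_rate
    contacted = subscribers * contacted_share
    churners_found = churners * capture_rate
    churners_saved = churners_found * conversion_rate
    # Saved churners keep paying, but at the discounted rate
    revenue_saved = (1 - discount) * churners_saved
    # Non-churners in the contacted group who accept cost us the discount
    non_churners_contacted = contacted - churners_found
    non_churners_discounted = non_churners_contacted * conversion_rate
    revenue_lost = discount * non_churners_discounted
    return revenue_saved - revenue_lost

for name, capture in [("DT1", 0.1), ("DT3", 0.4)]:
    print(name, round(campaign_net_return(2_000_000, 0.06, capture)))
# DT1 200
# DT3 3800
```

<p>Changing the conversion rate or the discount shows quickly how sensitive the business case is to those two guesses.</p>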
<p><strong>So what?</strong></p>
<p>So if this social network prediction works, why aren&#8217;t all the telco providers jumping on this?  There are some companies like Idiro, Xtract and Datanetis that are starting to offer churn solutions around this idea.</p>
<p>I think there are 2 main reasons why the telco uptake has been slow.</p>
<p>(1) scalability, performance and reliability &#8211; telcos don&#8217;t just have 2 million subscribers, most have tens of millions of subscribers.  This type of social network analytics must score the data in minutes to have any operational value.</p>
<p>Shameless plug here : Check out <a href="http://www.sonamine.com" target="_blank">http://www.sonamine.com</a> for a high performance engine that can handle hundreds of millions of nodes and connections in minutes instead of hours.</p>
<p>(2) telcos have been battling churn for a long time and have their own processes that work.  Using social network analytics must therefore integrate with their current processes.  Complete solutions that spawn a different marketing and analytical process will be difficult to integrate and show proof of success.</p>
<p><strong>References</strong></p>
<p>Social ties and their relevance to churn in mobile telecoms networks.  K. Dasgupta, R. Singh, B. Viswanathan, D. Chakraborty, S. Mukherjea, A. Nanavati, A. Joshi.  EDBT&#8217;08.  March 25-30, 2008.  Nantes, France.</p>
]]></content:encoded>
			<wfw:commentRss>http://sonamine.wordpress.com/2009/07/15/social-network-structure-predicts-churn-in-mobile-telecommunications/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01593a36aa4a0ef58383fcdf289243eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Scalable graph analysis</media:title>
		</media:content>

		<media:content url="http://sonamine.files.wordpress.com/2009/07/churn-neighbors1.jpg" medium="image">
			<media:title type="html">churn neighbors</media:title>
		</media:content>

		<media:content url="http://sonamine.files.wordpress.com/2009/07/churn-neighbors2.jpg" medium="image">
			<media:title type="html">churn neighbors</media:title>
		</media:content>
	</item>
		<item>
		<title>Smokers quit together as a connected cluster in a social network</title>
		<link>http://sonamine.wordpress.com/2009/06/08/smokers-quit-together-as-a-connected-cluster-in-a-social-network/</link>
		<comments>http://sonamine.wordpress.com/2009/06/08/smokers-quit-together-as-a-connected-cluster-in-a-social-network/#comments</comments>
		<pubDate>Mon, 08 Jun 2009 18:19:28 +0000</pubDate>
		<dc:creator>sonamine</dc:creator>
				<category><![CDATA[Public health]]></category>
		<category><![CDATA[Social graph]]></category>
		<category><![CDATA[smoking]]></category>
		<category><![CDATA[Social networks]]></category>

		<guid isPermaLink="false">http://sonamine.wordpress.com/?p=207</guid>
		<description><![CDATA[Smoking in the US has dropped from 45% to 21% in the past 4 decades.  The latest research in smoking cessation studies how social networks influence the likelihood of quitting.   In this study, the authors were examining 6 areas :  (1) are there social network clusters of smokers and non-smokers (2) relationship between one [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sonamine.wordpress.com&amp;blog=7371213&amp;post=207&amp;subd=sonamine&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Smoking in the US has dropped from 45% to 21% in the past 4 decades.  The latest research in smoking cessation studies how social networks influence the likelihood of quitting.  In this study, the authors examined 6 questions: (1) are there social network clusters of smokers and non-smokers? (2) how does one person&#8217;s smoking behavior relate to the smoking behavior of his/her social network contacts? (3) how does the relationship in (2) depend on the type of social tie? (4) how does education influence the spread of smoking? (5) does cessation behavior occur in sub-networks? (6) do smokers occupy special positions in the social network?</p>
<p>The data consisted of 12,067 subjects followed over a period of 32 years.  Individual behaviors were recorded every few years.  Clusters of smokers were identified by finding smokers that were &#8220;fully connected&#8221;.  In graph theory, this translates to a complete subgraph, also called a clique.  Using network attributes such as eigenvector centrality, the researchers fitted logistic regressions to identify the effect of centrality on smoking behavior.</p>
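<p>To make the two graph terms concrete, here is a minimal Python sketch, not the authors&#8217; code.  The five-person contact graph and the names <code>contacts</code>, <code>eigenvector_centrality</code> and <code>is_clique</code> are made up for illustration.  Centrality is computed by plain power iteration, and the clique check simply tests that every pair in a group is linked.</p>

```python
def eigenvector_centrality(adjacency, iterations=100):
    """Power iteration on an undirected graph given as {node: set_of_neighbours}."""
    scores = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        # Adding the node's own score (A + I) keeps the same ranking as A
        # but stops the iteration from oscillating on bipartite graphs.
        updated = {
            node: scores[node] + sum(scores[n] for n in neighbours)
            for node, neighbours in adjacency.items()
        }
        norm = sum(value * value for value in updated.values()) ** 0.5
        scores = {node: value / norm for node, value in updated.items()}
    return scores


def is_clique(adjacency, nodes):
    """True when every pair of the given nodes is directly connected."""
    return all(b in adjacency[a] for a in nodes for b in nodes if a != b)


# Hypothetical contacts: a, b, c all know each other; d and e hang off c.
contacts = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

scores = eigenvector_centrality(contacts)
most_central = max(scores, key=scores.get)
least_central = min(scores, key=scores.get)
```

<p>In this toy graph, <code>c</code> sits in the middle of the network and <code>e</code> on the periphery, and only the group a, b, c counts as &#8220;fully connected&#8221;.</p>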
<p><strong>Results</strong><br />
<span style="font-weight:normal;">Although the percentage of smokers dropped by about 50%, the cluster size of smokers remained relatively unchanged.  This indicates that smokers are quitting in connected social groups.   </span></p>
<p><img class="aligncenter size-full wp-image-211" title="smokers1" src="http://sonamine.files.wordpress.com/2009/06/smokers1.png?w=500" alt="smokers1" />At the same time, the eigenvector centrality of smokers fell, relegating them to the periphery of the network.  </p>
<p><img class="aligncenter size-full wp-image-212" title="smokers2" src="http://sonamine.files.wordpress.com/2009/06/smokers2.png?w=387&#038;h=350" alt="smokers2" width="387" height="350" /></p>
<p>Among the various other findings, I find this one most interesting:  They found that geographical &#8220;distance did not modify the intensity of the effect of the contact&#8217;s smoking behavior on the behavior of the subject.  That is, smoking behavior was related between subjects and their contacts, regardless of how far apart they were geographically.&#8221;</p>
<p><strong>So what?</strong></p>
<p><strong><span style="font-weight:normal;">There are various smoking cessation social networks out there such as quitnet.com etc.  Since there is evidence that geographical distance does not alter the effect of social networks, we can hypothesize that online social networks might have the same effect as real life face-to-face networks.  Each of these online quitting networks can therefore further tailor their offerings if they can mine their network data and score their users with network &#8220;smoking&#8221; attributes.  Artificially creating clusters of quitters from the isolated smokers might be a viable intervention option. </span></strong></p>
<p><strong>Reference</strong></p>
<p>The collective dynamics of smoking in a large social network.  Christakis, N. A. and Fowler, J. H.  <em>New England Journal of Medicine</em>  2008;358:2249-58</p>
]]></content:encoded>
			<wfw:commentRss>http://sonamine.wordpress.com/2009/06/08/smokers-quit-together-as-a-connected-cluster-in-a-social-network/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01593a36aa4a0ef58383fcdf289243eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Scalable graph analysis</media:title>
		</media:content>

		<media:content url="http://sonamine.files.wordpress.com/2009/06/smokers1.png" medium="image">
			<media:title type="html">smokers1</media:title>
		</media:content>

		<media:content url="http://sonamine.files.wordpress.com/2009/06/smokers2.png" medium="image">
			<media:title type="html">smokers2</media:title>
		</media:content>
	</item>
		<item>
		<title>Optimizing restoration capacity in AT&amp;T Network</title>
		<link>http://sonamine.wordpress.com/2009/06/01/optimizing-restoration-capacity-in-att-network/</link>
		<comments>http://sonamine.wordpress.com/2009/06/01/optimizing-restoration-capacity-in-att-network/#comments</comments>
		<pubDate>Mon, 01 Jun 2009 15:21:33 +0000</pubDate>
		<dc:creator>sonamine</dc:creator>
				<category><![CDATA[Routing problems]]></category>
		<category><![CDATA[Operations Research]]></category>
		<category><![CDATA[shortest path]]></category>
		<category><![CDATA[operations]]></category>
		<category><![CDATA[restoration capacity]]></category>

		<guid isPermaLink="false">http://sonamine.wordpress.com/?p=194</guid>
		<description><![CDATA[The AT&#38;T communications network is a complex sprawl of fiber optic cables linked by routing stations and nodes.  In 1997, AT&#38;T handled anywhere from 250 to 290 million calls per day.  Amazingly, 99.98% of all calls were completed on the first attempt.  To achieve this feat, AT&#38;T uses a patented Real Time Network Routing (RTNR) [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sonamine.wordpress.com&amp;blog=7371213&amp;post=194&amp;subd=sonamine&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The AT&amp;T communications network is a complex sprawl of fiber optic cables linked by routing stations and nodes.  In 1997, AT&amp;T handled anywhere from 250 to 290 million calls per day.  Amazingly, 99.98% of all calls were completed on the first attempt.  To achieve this feat, AT&amp;T uses a patented Real Time Network Routing (RTNR) algorithm, which is implemented by the Fast Automatic Restoration (FASTAR) system.  These algorithms and systems allow network planners to build restoration capacity into the network, and allow the routing nodes to redirect traffic to spare-capacity routes when there are failures in the network, whether from fiber cuts or power outages. </p>
<p>Deciding how much restoration capacity to include and how to route affected calls is a complex network (graph) problem.  FASTAR works by first having sentinel nodes report failed fibers or nodes.  FASTAR then works out the affected demand, i.e. calls, and the best alternative restoration path for each.  Finally, FASTAR sends new instructions to the routing nodes so that the affected calls are sent down the alternate path.  </p>
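<p>The rerouting step can be illustrated with a small Python sketch.  This is not FASTAR&#8217;s actual logic, which the paper summary above does not spell out.  It assumes a made-up four-node network, ignores link capacity, and treats the &#8220;best&#8221; alternative as the path with the fewest hops, found by breadth-first search.  The function name <code>restoration_path</code> is hypothetical.</p>

```python
from collections import deque


def restoration_path(links, source, target, failed_link):
    """Fewest-hop path from source to target that avoids the failed link.

    Returns a list of nodes, or None when no alternative route exists.
    """
    failed = frozenset(failed_link)
    neighbours = {}
    for a, b in links:
        if frozenset((a, b)) == failed:
            continue  # leave the cut fiber out of the graph
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbours.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical network: two routes from A to D, one via B and one via C.
links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")]
detour = restoration_path(links, "A", "D", failed_link=("A", "B"))
stranded = restoration_path([("A", "B")], "A", "B", failed_link=("A", "B"))
```

<p>With the A-B fiber cut, calls from A to D are sent via C.  When the only link fails there is no route at all, which is exactly the situation spare capacity is meant to prevent.</p>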
<p>AT&amp;T Labs and the network planning team came up with an alternative way to model this restoration capacity problem using a graph model and linear programming.  The goals were twofold: to restore as many affected calls as possible, and to minimize the cost of implementing the restoration capacity.  The network being modeled was pretty &#8216;large&#8217;: hundreds of nodes, thousands of connected arcs, tens of thousands of demand units.  The resulting LP model had millions of variables and constraints.  As a result, data aggregation was performed to keep analysis times short.  </p>
<p>It took 9 months to build the model.  The LP run using aggregated data, on a Sun Enterprise 3000 machine with 2GB of RAM and CPLEX v4 as the LP solver, completed in 2 hours.  The unaggregated data run took 6 CPU days. </p>
<p><strong>Results</strong></p>
<p>Using the 1997 network plan as a baseline, the new model found that it needed 39% fewer additions to the network than the current system!  For the 1998 planning session, the new model showed savings of 36% without any degradation in performance.  With no incremental new builds and some small additional cabling, the model returned over 6800 demand units that were previously reserved for restoration capacity to active service, thus providing more revenue-generating capacity without incremental costs!</p>
<p><strong>So what?</strong></p>
<p>This application shows that in some cases, real time network analysis is required.  As a result, some data size and analytic compromises were made.  Faster computation options should allow more data to be used in the future.</p>
<p><strong>References</strong></p>
<p>Optimizing restoration capacity in the AT&amp;T network.  Ambs, K., et al.  <em>Interfaces</em> 30, 2000, p. 26-44</p>
]]></content:encoded>
			<wfw:commentRss>http://sonamine.wordpress.com/2009/06/01/optimizing-restoration-capacity-in-att-network/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01593a36aa4a0ef58383fcdf289243eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Scalable graph analysis</media:title>
		</media:content>
	</item>
		<item>
		<title>Outbreak detection in a water distribution network</title>
		<link>http://sonamine.wordpress.com/2009/05/20/outbreak-detection-in-a-water-distribution-network/</link>
		<comments>http://sonamine.wordpress.com/2009/05/20/outbreak-detection-in-a-water-distribution-network/#comments</comments>
		<pubDate>Wed, 20 May 2009 19:45:12 +0000</pubDate>
		<dc:creator>sonamine</dc:creator>
				<category><![CDATA[Public safety]]></category>
		<category><![CDATA[cascades]]></category>
		<category><![CDATA[network structure]]></category>
		<category><![CDATA[optimized placement]]></category>

		<guid isPermaLink="false">http://sonamine.wordpress.com/?p=188</guid>
		<description><![CDATA[In 2006, the Water Distribution Systems Analysis Symposium put out a challenge to identify the most cost effective way to detect contaminants in a given water system.  They put out two data sets, each complete with information about water flow, pressure, population affected.  The two datasets had 129 nodes (pipe junctions, distribution points) and 12527 [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sonamine.wordpress.com&amp;blog=7371213&amp;post=188&amp;subd=sonamine&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In 2006, the Water Distribution Systems Analysis Symposium put out a challenge to identify the most cost effective way to detect contaminants in a given water system.  They put out two data sets, each complete with information about water flow, pressure, population affected.  The two datasets had 129 nodes (pipe junctions, distribution points) and 12527 nodes respectively.</p>
<p>The challenge was to identify where to place 20 sensors so as to optimize 4 different criteria in the event of specific contamination scenarios: minimizing the expected time of detection, the expected population affected prior to detection and the expected volume of contaminated water consumed prior to detection, and maximizing the detection likelihood.</p>
<p>Leskovec and team were able to use network analysis techniques to accomplish this task.   One key contribution they made in this paper was to improve the speed of the technique.  In their words, &#8220;we need(ed) to simulate 3.6 million contamination scenarios, each of which takes approximately 7 seconds&#8230;.We ran the simulation for a month on a cluster of about 40 machines&#8221;.  Without the improvements that resulted in a compression of the data and thus fitting the raw data into RAM, the simulations would have taken 1000x longer.   That&#8217;s not a typo&#8230; 1000x longer.</p>
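<p>To give a feel for the placement problem, here is a plain greedy sketch in Python.  It is not the authors&#8217; optimized method.  It assumes a tiny made-up table saying which contamination scenarios each junction would detect, and it only targets detection likelihood: at each step it adds the junction that catches the most scenarios not yet covered.  The names <code>detects</code> and <code>greedy_sensor_placement</code> are hypothetical.</p>

```python
def greedy_sensor_placement(detects, budget):
    """Choose up to `budget` nodes by repeatedly taking the largest marginal gain.

    `detects` maps each candidate node to the set of scenario ids it would detect.
    Returns the chosen nodes in pick order and the set of covered scenarios.
    """
    chosen, covered = [], set()
    for _ in range(budget):
        best = max(
            (node for node in detects if node not in chosen),
            key=lambda node: len(detects[node] - covered),
            default=None,
        )
        if best is None or not detects[best] - covered:
            break  # no remaining node detects anything new
        chosen.append(best)
        covered |= detects[best]
    return chosen, covered


# Hypothetical simulation output: scenario ids detected at each junction.
detects = {
    "j1": {1, 2, 3},
    "j2": {3, 4},
    "j3": {4, 5, 6},
    "j4": {1, 6},
}
chosen, covered = greedy_sensor_placement(detects, budget=2)
```

<p>Here two sensors at j1 and j3 already catch all six scenarios.  Every greedy step has to re-evaluate the candidates against the simulated scenarios, which is why the cost of evaluating scenarios dominates the running time at the scale quoted above.</p>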
<p><strong>So what?</strong></p>
<p>Some very useful network/graph techniques are going to be available for commercial use when better algorithms and distributed systems come together!</p>
<p><strong>References</strong></p>
<p>Cost effective outbreak detection in networks.  Leskovec, J., et al.  KDD&#8217;07 Aug 12-15, 2007.  San Jose, California, USA.</p>
]]></content:encoded>
			<wfw:commentRss>http://sonamine.wordpress.com/2009/05/20/outbreak-detection-in-a-water-distribution-network/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01593a36aa4a0ef58383fcdf289243eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Scalable graph analysis</media:title>
		</media:content>
	</item>
		<item>
		<title>Community analytics using graphs</title>
		<link>http://sonamine.wordpress.com/2009/05/19/community-analytics-using-graphs/</link>
		<comments>http://sonamine.wordpress.com/2009/05/19/community-analytics-using-graphs/#comments</comments>
		<pubDate>Tue, 19 May 2009 17:07:56 +0000</pubDate>
		<dc:creator>sonamine</dc:creator>
				<category><![CDATA[Marketing]]></category>
		<category><![CDATA[Social networks]]></category>
		<category><![CDATA[Support]]></category>
		<category><![CDATA[Centrality]]></category>
		<category><![CDATA[communities]]></category>

		<guid isPermaLink="false">http://sonamine.wordpress.com/?p=167</guid>
		<description><![CDATA[Companies are deploying communities in droves to drive down support costs.  Folks like Microsoft, Intel and Best Buy all have communities, blogs and bulletin boards centered around product information and support.  Sonamine will embark on a project to figure out how community managers can further use network analytics to improve their communities.   The output [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sonamine.wordpress.com&amp;blog=7371213&amp;post=167&amp;subd=sonamine&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Companies are deploying communities in droves to drive down support costs.  Folks like Microsoft, Intel and Best Buy all have communities, blogs and bulletin boards centered around product information and support.  Sonamine will embark on a project to figure out how community managers can further use network analytics to improve their communities.   The output will be a whitepaper that will be shared with the participating community managers.</p>
<p>Any community managers or media strategists who want to get involved are welcome to contact me at nick@sonamine.com</p>
<p>Link to project page <a href="http://sonamine.wordpress.com/community-analytics/" target="_self">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://sonamine.wordpress.com/2009/05/19/community-analytics-using-graphs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01593a36aa4a0ef58383fcdf289243eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Scalable graph analysis</media:title>
		</media:content>
	</item>
		<item>
		<title>Predicting cancer aggressiveness using gene networks</title>
		<link>http://sonamine.wordpress.com/2009/05/17/predicting-cancer-aggressiveness-using-gene-networks/</link>
		<comments>http://sonamine.wordpress.com/2009/05/17/predicting-cancer-aggressiveness-using-gene-networks/#comments</comments>
		<pubDate>Sun, 17 May 2009 00:00:21 +0000</pubDate>
		<dc:creator>sonamine</dc:creator>
				<category><![CDATA[Biology]]></category>
		<category><![CDATA[Health]]></category>
		<category><![CDATA[cancer]]></category>
		<category><![CDATA[gene network]]></category>
		<category><![CDATA[prediction]]></category>
		<category><![CDATA[predictive analysis]]></category>

		<guid isPermaLink="false">http://sonamine.wordpress.com/?p=158</guid>
		<description><![CDATA[Cancer treatment has been improving in great strides.  One important factor in deciding on the course of treatment is how likely the cancer is to spread aggressively.  Metastasis is the term used to describe the cancer spreading.  If doctors think a cancer will spread aggressively, then the recommended treatment will also be more aggressive.  More [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sonamine.wordpress.com&amp;blog=7371213&amp;post=158&amp;subd=sonamine&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Cancer treatment has been improving in great strides.  One important factor in deciding on the course of treatment is how likely the cancer is to spread aggressively.  Metastasis is the term used to describe the cancer spreading.  If doctors think a cancer will spread aggressively, then the recommended treatment will also be more aggressive.  More aggressive treatment tends to result in more side effects.  </p>
<p>The aggressiveness of cancers has been studied in wide-scale protein marker studies, i.e. which proteins and genes can be used to predict the level of aggression.  So far, the gene and protein predictors have been &#8220;scalar&#8221; and &#8220;independent&#8221;.  This study uses a novel approach: look at the network of interactions between various proteins, and use these networks to predict metastasis.  </p>
<p>The method is fairly straightforward.  The dataset consisted of 8141 genes studied from 2 sets of breast cancer patients.  Some of these patients developed metastatic cancer, some did not.  To find the subnetworks, the authors implemented a simple algorithm with 2 elements: a scoring function for the subnetwork and a &#8220;greedy node addition&#8221; step.  They started with a random node and added a neighboring protein node.  The scoring function was then recalculated to see whether the new node improved the score.  By repeating this process, they found 149 and 243 significant subnetworks in the 2 data sets respectively.  </p>
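<p>The greedy search can be sketched in a few lines of Python.  This is an illustration under stated assumptions, not the paper&#8217;s implementation: the four-gene interaction map, the per-gene weights and the scoring function <code>toy_score</code> are all invented, and the seed is fixed rather than random so the result is repeatable.</p>

```python
def grow_subnetwork(interactions, score, seed):
    """Grow a subnetwork from `seed` by greedy node addition.

    At each step the neighbouring node that gives the highest score is added;
    the search stops as soon as no neighbour improves the current score.
    """
    members = {seed}
    current = score(members)
    while True:
        frontier = {n for m in members for n in interactions[m]} - members
        if not frontier:
            break
        # Score every one-node extension and keep the best one.
        best_node = max(sorted(frontier), key=lambda n: score(members | {n}))
        best_score = score(members | {best_node})
        if not best_score > current:
            break  # no neighbour adds value
        members.add(best_node)
        current = best_score
    return members, current


# Hypothetical protein interaction map and per-gene discriminative weights.
interactions = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g4"},
    "g3": {"g1"},
    "g4": {"g2"},
}
weights = {"g1": 2.0, "g2": 1.5, "g3": -1.0, "g4": 1.8}


def toy_score(members):
    """Stand-in score: summed weights scaled by the square root of the size."""
    return sum(weights[g] for g in members) / len(members) ** 0.5


subnetwork, final_score = grow_subnetwork(interactions, toy_score, seed="g1")
```

<p>Starting from g1, the search picks up g2 and then g4, and rejects g3 because adding it would lower the score.</p>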
<p>Each subnetwork was then measured with an activity score that is different from the discriminant scoring function used to find it.  These activity scores were then analyzed with logistic regression to see how well they predicted whether a particular patient would become metastatic.  </p>
<p>At a fixed sensitivity of 90%, the subnetwork markers achieved 70.1% (van de Vijver et al, 2002) and 72.2% (Wang et al, 2005) accuracy, measured as the percentage of correct classifications using five-fold cross-validation within each data set.  This compares favorably with the 62% and 63% reported in the original studies, which used only gene markers.  An AUC analysis also shows that these subnetworks do a better job of predicting whether a particular cancer patient has an aggressive cancer.</p>
<p><img class="aligncenter size-full wp-image-162" title="auc" src="http://sonamine.files.wordpress.com/2009/05/auc.png?w=500&#038;h=272" alt="auc" width="500" height="272" /></p>
<p><strong>So what?</strong></p>
<p>This study lets us start looking for groups of genes that together affect our health, not just individual genes.</p>
<p><strong>References</strong></p>
<p>Network based classification of breast cancer metastasis.  Chuang, H.Y. et al.  <em>Molecular Systems Biology</em> 3:140.  2007</p>
]]></content:encoded>
			<wfw:commentRss>http://sonamine.wordpress.com/2009/05/17/predicting-cancer-aggressiveness-using-gene-networks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01593a36aa4a0ef58383fcdf289243eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Scalable graph analysis</media:title>
		</media:content>

		<media:content url="http://sonamine.files.wordpress.com/2009/05/auc.png" medium="image">
			<media:title type="html">auc</media:title>
		</media:content>
	</item>
		<item>
		<title>Finding hierarchy in football (and any other) schedules</title>
		<link>http://sonamine.wordpress.com/2009/05/15/finding-hierarchy-in-football-and-any-other-schedules/</link>
		<comments>http://sonamine.wordpress.com/2009/05/15/finding-hierarchy-in-football-and-any-other-schedules/#comments</comments>
		<pubDate>Fri, 15 May 2009 16:55:46 +0000</pubDate>
		<dc:creator>sonamine</dc:creator>
				<category><![CDATA[Organization]]></category>
		<category><![CDATA[Sports]]></category>
		<category><![CDATA[max flow]]></category>
		<category><![CDATA[network structure]]></category>
		<category><![CDATA[NFL]]></category>
		<category><![CDATA[sparse cut]]></category>

		<guid isPermaLink="false">http://sonamine.wordpress.com/?p=150</guid>
		<description><![CDATA[An interesting question:  If a Martian were to look at the NFL match schedules, would it be able to correctly identify the NFC and AFC, with the east, west, north and south divisions and the four teams in each of them?  Mann and team decided to use graph-based techniques to tease this out.  The first [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sonamine.wordpress.com&amp;blog=7371213&amp;post=150&amp;subd=sonamine&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>An interesting question:  If a Martian were to look at the NFL match schedules, would it be able to correctly identify the NFC and AFC, with the east, west, north and south divisions and the four teams in each of them? </p>
<p>Mann and team decided to use graph-based techniques to tease this out.  The first step is to recognize that the game schedule is actually a graph in disguise.  Teams are nodes in the graph; when 2 teams play, they are connected by an edge.   </p>
<p>Next, to identify hierarchies and clusters in the network, one assumes that the amount of &#8220;connectivity&#8221; within each cluster is much higher than between clusters.  One way to measure connectivity is to think in terms of the flow of something, like water perhaps, through the network graph.  If a particular set of nodes (teams) is highly connected, then the flow through these nodes is very high.  That&#8217;s quite intuitive since there are more &#8220;pipes&#8221; among the nodes.</p>
<p>So the problem is really a graph problem.  You want to &#8220;cut&#8221; the graph into the smallest number of subgraphs while maintaining the highest flow within each subgraph and having the highest maximum concurrent flow.  The order of the cuts determines both the clusters and the hierarchical ordering of the subgraphs.</p>
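<p>Here is a toy Python version of a single cut.  It is a brute-force sketch, not the flow-based method the authors use: it tries every way of splitting the teams in two and keeps the split with the fewest games across it relative to the sizes of the two sides.  That is only feasible for a handful of teams.  The six-team schedule and the name <code>sparsest_cut</code> are made up for illustration.</p>

```python
from itertools import combinations


def sparsest_cut(nodes, edges):
    """Brute-force the two-way split minimizing crossing / (size_a * size_b).

    Returns one side of the best split and its cut density.
    """
    nodes = sorted(nodes)
    best_density, best_side = None, None
    for size in range(1, len(nodes) // 2 + 1):
        for side in combinations(nodes, size):
            inside = set(side)
            # An edge crosses the cut when exactly one endpoint is inside.
            crossing = sum(1 for a, b in edges if (a in inside) != (b in inside))
            density = crossing / (len(inside) * (len(nodes) - len(inside)))
            if best_density is None or best_density > density:
                best_density, best_side = density, inside
    return best_side, best_density


# Hypothetical schedule: two groups of three teams that all play each other,
# plus a single game between the groups.
teams = ["t1", "t2", "t3", "t4", "t5", "t6"]
games = [
    ("t1", "t2"), ("t1", "t3"), ("t2", "t3"),
    ("t4", "t5"), ("t4", "t6"), ("t5", "t6"),
    ("t3", "t4"),
]
side, density = sparsest_cut(teams, games)
```

<p>The best split separates t1, t2, t3 from t4, t5, t6, since only one game crosses it.  Applying the same cut again inside each side is what would produce the hierarchy.</p>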
<p>They were able to correctly deduce the hierarchical structure from the team match schedule for both the NFL and the NCAA. The matrix below shows the two diagonal blocks for the NFC and AFC, with the further subdivisions in different shades of grey. Neat.</p>
<p><img class="aligncenter size-full wp-image-153" title="games" src="http://sonamine.files.wordpress.com/2009/05/games.png?w=500&#038;h=348" alt="games" width="500" height="348" /></p>
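<p>The block-diagonal look of such a matrix can be reproduced on a toy schedule by ordering the rows and columns by cluster. The sketch below uses a made-up six-team schedule and a hypothetical helper, <code>ordered_matrix</code>; it is only meant to show why clustered teams appear as dense squares along the diagonal.</p>

```python
def ordered_matrix(edges, order):
    """Symmetric games-played matrix with rows and columns in the given team order."""
    index = {team: position for position, team in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for (u, v), weight in edges.items():
        matrix[index[u]][index[v]] = weight
        matrix[index[v]][index[u]] = weight
    return matrix


# Hypothetical schedule: two groups of three, with one cross-group game (C vs D).
edges = {
    ("A", "B"): 2, ("A", "C"): 2, ("B", "C"): 2,
    ("D", "E"): 2, ("D", "F"): 2, ("E", "F"): 2,
    ("C", "D"): 1,
}
order = ["A", "B", "C", "D", "E", "F"]  # teams listed cluster by cluster
matrix = ordered_matrix(edges, order)

for team, row in zip(order, matrix):
    print(team, " ".join(str(weight) for weight in row))
```

<p>With the teams listed cluster by cluster, the nonzero entries fall into two 3-by-3 blocks on the diagonal, plus the single cross-group game between C and D.</p>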
<p><strong>So what?<br />
</strong>Our human social network also has a hierarchical structure organized along age. We speak less to our parents but more to our friends. Some of these techniques might be used to identify parent-child clusters from call or Facebook social graphs!</p>
<p><strong>References</strong></p>
<p>The use of sparsest cuts to reveal the hierarchical community structure of social networks. Mann, C. F., Matula, D. W., Olinick, E. V. <em>Social Networks</em>, 30 (2008) 223-234.</p>
]]></content:encoded>
			<wfw:commentRss>http://sonamine.wordpress.com/2009/05/15/finding-hierarchy-in-football-and-any-other-schedules/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/01593a36aa4a0ef58383fcdf289243eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Scalable graph analysis</media:title>
		</media:content>

		<media:content url="http://sonamine.files.wordpress.com/2009/05/games.png" medium="image">
			<media:title type="html">games</media:title>
		</media:content>
	</item>
	</channel>
</rss>
