Publications
2010
Marc Maier, Brian Taylor, Huseyin Oktay, David Jensen
Learning causal models of relational domains Proceedings Article
In: Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2010, Atlanta, Georgia, USA, July 11-15, 2010.
@inproceedings{maier2010learning,
title = {Learning causal models of relational domains},
author = {Marc Maier and Brian Taylor and Huseyin Oktay and David Jensen},
url = {http://www.aaai.org/ocs/index.php/AAAI/AAAI10/paper/view/1919},
year = {2010},
date = {2010-01-01},
booktitle = {Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence,
AAAI 2010, Atlanta, Georgia, USA, July 11-15, 2010},
volume = {24},
number = {1},
abstract = {Methods for discovering causal knowledge from observational data have been a persistent topic of AI research for several decades. Essentially all of this work focuses on knowledge representations for propositional domains. In this paper, we present several key algorithmic and theoretical innovations that extend causal discovery to relational domains. We provide strong evidence that effective learning of causal models is enhanced by relational representations. We present an algorithm, relational PC, that learns causal dependencies in a state-of-the-art relational representation, and we identify the key representational and algorithmic innovations that make the algorithm possible. Finally, we prove the algorithm's theoretical correctness and demonstrate its effectiveness on synthetic and real data sets.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Huseyin Oktay, Brian Taylor, David Jensen
Causal discovery in social media using quasi-experimental designs Proceedings Article
In: Proceedings of the 3rd Workshop on Social Network Mining and Analysis, SNAKDD, pp. 1–9, 2010.
@inproceedings{oktay2010causal,
title = {Causal discovery in social media using quasi-experimental designs},
author = {Huseyin Oktay and Brian Taylor and David Jensen},
url = {https://doi.org/10.1145/1964858.1964859},
year = {2010},
date = {2010-01-01},
booktitle = {Proceedings of the 3rd Workshop on Social Network Mining and Analysis,
SNAKDD},
pages = {1--9},
abstract = {Social media systems have become increasingly attractive to both users and companies providing those systems. Efficient management of these systems is essential and requires knowledge of cause-and-effect relationships within the system. Online experimentation can be used to discover causal knowledge; however, this ignores the observational data that is already being collected for operational purposes. Quasi-experimental designs (QEDs) are commonly used in social sciences to discover causal knowledge from observational data, and QEDs can be exploited to discover causal knowledge about social media systems. In this paper, we apply three different QEDs to demonstrate how one can gain a causal understanding of a social media system. The conclusions drawn from using a QED can have threats to their validity, but we show how one can carefully construct sophisticated designs to overcome some of those threats.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Brian Delaney, Andrew Fast, W. Campbell, C. Weinstein, David Jensen
The application of statistical relational learning to a database of criminal and terrorist activity Proceedings Article
In: Proceedings of the 2010 SIAM International Conference on Data Mining, pp. 409–417, Society for Industrial and Applied Mathematics, 2010.
@inproceedings{delaney2010application,
title = {The application of statistical relational learning to a database of criminal and terrorist activity},
author = {Brian Delaney and Andrew Fast and W. Campbell and C. Weinstein and David Jensen},
url = {https://doi.org/10.1137/1.9781611972801.36},
year = {2010},
date = {2010-01-01},
booktitle = {Proceedings of the 2010 SIAM International Conference on Data Mining},
pages = {409--417},
organization = {Society for Industrial and Applied Mathematics},
abstract = {We apply statistical relational learning to a database of criminal and terrorist activity to predict attributes and event outcomes. The database stems from a collection of news articles and court records which are carefully annotated with a variety of variables, including categorical and continuous fields. Manual analysis of this data can help inform decision makers seeking to curb violent activity within a region. We use this data to build relational models from historical data to predict attributes of groups, individuals, or events. Our first example involves predicting social network roles within a group under a variety of different data conditions. Collective classification can be used to boost the accuracy under data poor conditions. Additionally, we were able to predict the outcome of hostage negotiations using models trained on previous kidnapping events. The overall framework and techniques described here are flexible enough to be used to predict a variety of variables. Such predictions could be used as input to a more complex system to recognize intent of terrorist groups or as input to inform human decision makers.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Matthew Rattigan, David Jensen
Leveraging d-separation for relational data sets Proceedings Article
In: ICDM 2010, The 10th IEEE International Conference on Data Mining, pp. 989–994, IEEE, 2010.
@inproceedings{rattigan2010leveraging,
title = {Leveraging d-separation for relational data sets},
author = {Matthew Rattigan and David Jensen},
url = {https://doi.org/10.1109/ICDM.2010.142},
year = {2010},
date = {2010-01-01},
booktitle = {ICDM 2010, The 10th IEEE International Conference on Data Mining},
pages = {989--994},
organization = {IEEE},
abstract = {Testing for marginal and conditional independence is a common task in machine learning and knowledge discovery applications. Prior work has demonstrated that conventional independence tests suffer from dramatically increased rates of Type I errors when naively applied to relational data. We use graphical models to specify the conditions under which these errors occur, and use those models to devise novel and accurate conditional independence tests.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2009
Michael Hay, Chao Li, Gerome Miklau, David Jensen
Accurate Estimation of the Degree Distribution of Private Networks Proceedings Article
In: ICDM 2009, The Ninth IEEE International Conference on Data Mining, Miami, Florida, USA, 6-9 December 2009, pp. 169–178, IEEE Computer Society, 2009.
@inproceedings{hay2009accurate,
title = {Accurate Estimation of the Degree Distribution of Private Networks},
author = {Michael Hay and Chao Li and Gerome Miklau and David Jensen},
url = {https://doi.org/10.1109/ICDM.2009.11},
year = {2009},
date = {2009-01-01},
booktitle = {ICDM 2009, The Ninth IEEE International Conference on Data Mining,
Miami, Florida, USA, 6-9 December 2009},
pages = {169--178},
publisher = {IEEE Computer Society},
abstract = {We describe an efficient algorithm for releasing a provably private estimate of the degree distribution of a network. The algorithm satisfies a rigorous property of differential privacy, and is also extremely efficient, running on networks of 100 million nodes in a few seconds. Theoretical analysis shows that the error scales linearly with the number of unique degrees, whereas the error of conventional techniques scales linearly with the number of nodes. We complement the theoretical analysis with a thorough empirical analysis on real and synthetic graphs, showing that the algorithm’s variance and bias is low, that the error diminishes as the size of the input graph increases, and that common analyses like fitting a power-law can be carried out very accurately.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
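A minimal sketch, assuming NetworkX and NumPy, of the Laplace-mechanism intuition behind differentially private degree statistics. It is illustration only: the paper's estimator additionally applies constrained inference to the noisy degree sequence, and the sensitivity bound, epsilon value, and graph generator below are assumptions, not details taken from the paper.

# Illustrative only: add Laplace noise to a degree histogram for
# epsilon-differential privacy under edge-level neighboring graphs.
# This is NOT the constrained-inference estimator described in the paper.
import numpy as np
import networkx as nx

def noisy_degree_histogram(graph: nx.Graph, epsilon: float) -> np.ndarray:
    """Return a Laplace-perturbed histogram of node degrees."""
    degrees = [d for _, d in graph.degree()]
    hist = np.bincount(degrees, minlength=graph.number_of_nodes()).astype(float)
    # Adding or removing one edge shifts two nodes to adjacent degree bins,
    # changing at most four histogram cells by 1, so L1 sensitivity <= 4.
    sensitivity = 4.0
    noise = np.random.laplace(scale=sensitivity / epsilon, size=hist.shape)
    return hist + noise

if __name__ == "__main__":
    g = nx.barabasi_albert_graph(10_000, 3)  # hypothetical test graph
    print(noisy_degree_histogram(g, epsilon=1.0)[:10])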
Andrew Fast, David Jensen
Constraint relaxation for learning the structure of Bayesian networks Technical Report
Tech Report 09-18, University of Massachusetts Amherst, Computer Science~… 2009.
@techreport{fast2009constraint,
title = {Constraint relaxation for learning the structure of Bayesian networks},
author = {Andrew Fast and David Jensen},
url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.217.7971&rep=rep1&type=pdf},
year = {2009},
date = {2009-01-01},
institution = {Tech Report 09-18, University of Massachusetts Amherst, Computer Science~…},
abstract = {This paper introduces constraint relaxation, a new strategy for learning the structure of Bayesian networks. Constraint relaxation identifies and “relaxes” possibly inaccurate independence constraints on the structure of the model. We describe a heuristic algorithm for constraint relaxation that combines greedy search in the space of undirected skeletons with edge orientation based on the constraints. This approach produces significant improvements in the structural accuracy of the learned models compared to four well-known structure learning algorithms in an empirical evaluation using data sampled from both real-world and randomly generated networks.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
2008
Michael Hay, Gerome Miklau, David Jensen, Don Towsley, Philipp Weis
Resisting structural re-identification in anonymized social networks Journal Article
In: Proceedings of the VLDB Endowment, vol. 1, no. 1, pp. 102–114, 2008.
@article{hay2008resisting,
title = {Resisting structural re-identification in anonymized social networks},
author = {Michael Hay and Gerome Miklau and David Jensen and Don Towsley and Philipp Weis},
url = {https://dl.acm.org/doi/pdf/10.14778/1453856.1453873},
year = {2008},
date = {2008-01-01},
journal = {Proceedings of the VLDB Endowment},
volume = {1},
number = {1},
pages = {102--114},
publisher = {VLDB Endowment},
abstract = {We identify privacy risks associated with releasing network data sets and provide an algorithm that mitigates those risks. A network consists of entities connected by links representing relations such as friendship, communication, or shared activity. Maintaining privacy when publishing networked data is uniquely challenging because an individual's network context can be used to identify them even if other identifying information is removed. In this paper, we quantify the privacy risks associated with three classes of attacks on the privacy of individuals in networks, based on the knowledge used by the adversary. We show that the risks of these attacks vary greatly based on network structure and size. We propose a novel approach to anonymizing network data that models aggregate network structure and then allows samples to be drawn from that model. The approach guarantees anonymity for network entities while preserving the ability to estimate a wide variety of network measures with relatively little bias.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Özgür Şimşek, David Jensen
Navigating networks by using homophily and degree Journal Article
In: Proceedings of the National Academy of Sciences, vol. 105, no. 35, pp. 12758–12762, 2008.
@article{csimcsek2008navigating,
title = {Navigating networks by using homophily and degree},
author = {Özgür Şimşek and David Jensen},
url = {https://www.pnas.org/content/pnas/105/35/12758.full.pdf},
year = {2008},
date = {2008-01-01},
journal = {Proceedings of the National Academy of Sciences},
volume = {105},
number = {35},
pages = {12758--12762},
publisher = {National Academy of Sciences},
abstract = {Many large distributed systems can be characterized as networks where short paths exist between nearly every pair of nodes. These include social, biological, communication, and distribution networks, which often display power-law or small-world structure. A central challenge of distributed systems is directing messages to specific nodes through a sequence of decisions made by individual nodes without global knowledge of the network. We present a probabilistic analysis of this navigation problem that produces a surprisingly simple and effective method for directing messages. This method requires calculating only the product of the two measures widely used to summarize all local information. It outperforms prior approaches reported in the literature by a large margin, and it provides a formal model that may describe how humans make decisions in sociological studies intended to explore the social network as well as how they make decisions in more naturalistic settings.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
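A minimal sketch, in Python with NetworkX, of the greedy forwarding rule the abstract describes: pass the message to the neighbor whose product of degree and similarity (homophily) to the target is largest. The "group" attribute and the toy similarity function are illustrative assumptions, not details from the paper.

# Sketch of degree-times-homophily greedy navigation (assumptions noted above).
import networkx as nx

def similarity(g: nx.Graph, node, target) -> float:
    """Toy homophily score: higher when node attributes match the target's."""
    return 1.0 if g.nodes[node].get("group") == g.nodes[target].get("group") else 0.1

def navigate(g: nx.Graph, source, target, max_steps: int = 100):
    """Greedily forward toward `target`, never revisiting a node; return the path."""
    path, current = [source], source
    for _ in range(max_steps):
        if current == target:
            break
        candidates = [n for n in g.neighbors(current) if n not in path]
        if not candidates:
            break  # dead end: the message is simply dropped in this sketch
        current = max(candidates, key=lambda n: g.degree(n) * similarity(g, n, target))
        path.append(current)
    return path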
Jennifer Neville, David Jensen
A bias/variance decomposition for models using collective inference Journal Article
In: Machine Learning, vol. 73, no. 1, pp. 87–106, 2008.
@article{neville2008bias,
title = {A bias/variance decomposition for models using collective inference},
author = {Jennifer Neville and David Jensen},
url = {https://doi.org/10.1007/s10994-008-5066-6},
year = {2008},
date = {2008-01-01},
journal = {Machine Learning},
volume = {73},
number = {1},
pages = {87--106},
publisher = {Springer US},
abstract = {Bias/variance analysis is a useful tool for investigating the performance of machine learning algorithms. Conventional analysis decomposes loss into errors due to aspects of the learning process, but in relational domains, the inference process used for prediction introduces an additional source of error. Collective inference techniques introduce additional error, both through the use of approximate inference algorithms and through variation in the availability of test-set information. To date, the impact of inference error on model performance has not been investigated. We propose a new bias/variance framework that decomposes loss into errors due to both the learning and inference processes. We evaluate the performance of three relational models on both synthetic and real-world datasets and show that (1) inference can be a significant source of error, and (2) the models exhibit different types of errors as data characteristics are varied.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Amy McGovern, David Jensen
Optimistic pruning for multiple instance learning Journal Article
In: Pattern Recognition Letters, vol. 29, no. 9, pp. 1252–1260, 2008.
@article{mcgovern2008optimistic,
title = {Optimistic pruning for multiple instance learning},
author = {Amy McGovern and David Jensen},
url = {https://doi.org/10.1016/j.patrec.2008.01.024},
year = {2008},
date = {2008-01-01},
journal = {Pattern Recognition Letters},
volume = {29},
number = {9},
pages = {1252--1260},
publisher = {North-Holland},
abstract = {This paper introduces a simple evaluation function for multiple instance learning that admits an optimistic pruning strategy. We demonstrate comparable results to state-of-the-art methods using significantly fewer computational resources.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Andrew Fast, David Jensen
Why stacked models perform effective collective classification Proceedings Article
In: Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), December 15-19, 2008, Pisa, Italy, pp. 785–790, IEEE Computer Society, 2008.
@inproceedings{fast2008stacked,
title = {Why stacked models perform effective collective classification},
author = {Andrew Fast and David Jensen},
url = {https://doi.org/10.1109/ICDM.2008.126},
year = {2008},
date = {2008-01-01},
booktitle = {Proceedings of the 8th IEEE International Conference on Data Mining
(ICDM 2008), December 15-19, 2008, Pisa, Italy},
pages = {785--790},
publisher = {IEEE Computer Society},
abstract = {Collective classification techniques jointly infer all class labels of a relational data set, using the inferences about one class label to influence inferences about related class labels. Kou and Cohen recently introduced an efficient relational model based on stacking that, despite its simplicity, has equivalent accuracy to more sophisticated joint inference approaches. Using experiments on both real and synthetic data, we show that the primary cause for the performance of the stacked model is the reduction in bias from learning the stacked model on inferred labels rather than true labels. The reduction in variance due to conditional inference also contributes to the effect but it is not as strong. In addition, we show that the performance of the joint inference and stacked learners can be attributed to an implicit weighting of local and relational features at learning time.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
David Jensen, Andrew Fast, Brian Taylor, Marc Maier
Automatic Identification of Quasi-Experimental Designs for Discovering Causal Knowledge Proceedings Article
In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 372–380, Association for Computing Machinery, Las Vegas, Nevada, USA, 2008, ISBN: 9781605581934.
@inproceedings{10.1145/1401890.1401938,
title = {Automatic Identification of Quasi-Experimental Designs for Discovering Causal Knowledge},
author = {David Jensen and Andrew Fast and Brian Taylor and Marc Maier},
url = {https://doi.org/10.1145/1401890.1401938},
doi = {10.1145/1401890.1401938},
isbn = {9781605581934},
year = {2008},
date = {2008-01-01},
booktitle = {Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
pages = {372–380},
publisher = {Association for Computing Machinery},
address = {Las Vegas, Nevada, USA},
series = {KDD '08},
abstract = {Researchers in the social and behavioral sciences routinely rely on quasi-experimental designs to discover knowledge from large data-bases. Quasi-experimental designs (QEDs) exploit fortuitous circumstances in non-experimental data to identify situations (sometimes called "natural experiments") that provide the equivalent of experimental control and randomization. QEDs allow researchers in domains as diverse as sociology, medicine, and marketing to draw reliable inferences about causal dependencies from non-experimental data. Unfortunately, identifying and exploiting QEDs has remained a painstaking manual activity, requiring researchers to scour available databases and apply substantial knowledge of statistics. However, recent advances in the expressiveness of databases, and increases in their size and complexity, provide the necessary conditions to automatically identify QEDs. In this paper, we describe the first system to discover knowledge by applying quasi-experimental designs that were identified automatically. We demonstrate that QEDs can be identified in a traditional database schema and that such identification requires only a small number of extensions to that schema, knowledge about quasi-experimental design encoded in first-order logic, and a theorem-proving engine. We describe several key innovations necessary to enable this system, including methods for automatically constructing appropriate experimental units and for creating aggregate variables on those units. We show that applying the resulting designs can identify important causal dependencies in real domains, and we provide examples from academic publishing, movie making and marketing, and peer-production systems. Finally, we discuss the integration of QEDs with other approaches to causal discovery, including joint modeling and directed experimentation.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Andrew Fast, Michael Hay, David Jensen
Improving accuracy of constraint-based structure learning Technical Report
Technical report 08-48, University of Massachusetts Amherst, Computer~… 2008.
@techreport{fast2008improving,
title = {Improving accuracy of constraint-based structure learning},
author = {Andrew Fast and Michael Hay and David Jensen},
url = {https://www.researchgate.net/profile/David-Jensen-10/publication/228854891_Improving_Accuracy_of_Constraint-Based_Structure_Learning/links/09e41510892d741c18000000/Improving-Accuracy-of-Constraint-Based-Structure-Learning.pdf},
year = {2008},
date = {2008-01-01},
institution = {Technical report 08-48, University of Massachusetts Amherst, Computer~…},
abstract = {Hybrid algorithms for learning the structure of Bayesian networks combine techniques from both the constraint-based and search-and-score paradigms of structure learning. One class of hybrid approaches uses a constraint-based algorithm to learn an undirected skeleton identifying edges that should appear in the final network. This skeleton is used to constrain the model space considered by a search-and-score algorithm to orient the edges and produce a final model structure. At small sample sizes, models learned using this hybrid approach do not achieve likelihood as high as models learned by unconstrained search. Low performance is a result of errors made by the skeleton identification algorithm, particularly false negative errors, which lead to an over-constrained search space. These errors are often attributed to “noisy” hypothesis tests that are run during skeleton identification. However, at least three specific sources of error have been identified in the literature: unsuitable hypothesis tests, low-power hypothesis tests, and unexplained d-separation. No previous work has considered these sources of error in combination. We determine the relative importance of each source individually and in combination. We identify that low-power tests are the primary source of false negative errors, and show that these errors can be corrected by a novel application of statistical power analysis. The result is a new hybrid algorithm for learning the structure of Bayesian networks which produces models with equivalent likelihood to models produced by unconstrained greedy search, using only a fraction of the time.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
2007
Jennifer Neville, David Jensen
Relational Dependency Networks Journal Article
In: Journal of Machine Learning Research, vol. 8, pp. 653–692, 2007.
@article{DBLP:journals/jmlr/NevilleJ07,
title = {Relational Dependency Networks},
author = {Jennifer Neville and David Jensen},
url = {http://dl.acm.org/citation.cfm?id=1314522},
year = {2007},
date = {2007-01-01},
journal = {Journal of Machine Learning Research},
volume = {8},
pages = {653--692},
abstract = {Recent work on graphical models for relational data has demonstrated significant improvements in classification and inference when models represent the dependencies among instances. Despite its use in conventional statistical models, the assumption of instance independence is contradicted by most relational data sets. For example, in citation data there are dependencies among the topics of a paper’s references, and in genomic data there are dependencies among the functions of interacting proteins. In this paper, we present relational dependency networks (RDNs), graphical models that are capable of expressing and reasoning with such dependencies in a relational setting. We discuss RDNs in the context of relational Bayes networks and relational Markov networks and outline the relative strengths of RDNs—namely, the ability to represent cyclic dependencies, simple methods for parameter estimation, and efficient structure learning techniques. The strengths of RDNs are due to the use of pseudolikelihood learning techniques, which estimate an efficient approximation of the full joint distribution. We present learned RDNs for a number of real-world data sets and evaluate the models in a prediction context, showing that RDNs identify and exploit cyclic relational dependencies to achieve significant performance gains over conventional conditional models. In addition, we use synthetic data to explore model performance under various relational data characteristics, showing that RDN learning and inference techniques are accurate over a wide range of conditions.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Michael Hay, Gerome Miklau, David Jensen, Philipp Weis, Siddharth Srivastava
Anonymizing social networks Journal Article
In: Computer Science Department Faculty Publication Series, pp. 180, 2007.
@article{hay2007anonymizing,
title = {Anonymizing social networks},
author = {Michael Hay and Gerome Miklau and David Jensen and Philipp Weis and Siddharth Srivastava},
url = {https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1175&context=cs_faculty_pubs},
year = {2007},
date = {2007-01-01},
journal = {Computer Science Department Faculty Publication Series},
pages = {180},
abstract = {Advances in technology have made it possible to collect data about individuals and the connections between them, such as email correspondence and friendships. Agencies and researchers who have collected such social network data often have a compelling interest in allowing others to analyze the data. However, in many cases the data describes relationships that are private (e.g., email correspondence) and sharing the data in full can result in unacceptable disclosures. In this paper, we present a framework for assessing the privacy risk of sharing anonymized network data. This includes a model of adversary knowledge, for which we consider several variants and make connections to known graph theoretical results. On several real-world social networks, we show that simple anonymization techniques are inadequate, resulting in substantial breaches of privacy for even modestly informed adversaries. We propose a novel anonymization technique based on perturbing the network and demonstrate empirically that it leads to substantial reduction of the privacy threat. We also analyze the effect that anonymizing the network has on the utility of the data for social network analysis.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Michael Hay, Andrew Fast, David Jensen
Understanding the effects of search constraints on structure learning Journal Article
In: University of Massachusetts Amherst, Computer Science Technical Report 07-21, 2007.
@article{hay2007understanding,
title = {Understanding the effects of search constraints on structure learning},
author = {Michael Hay and Andrew Fast and David Jensen},
url = {https://kdl.cs.umass.edu/papers/hay-et-al-tr0721.pdf},
year = {2007},
date = {2007-01-01},
journal = {University of Massachusetts Amherst, Computer Science Technical Report},
number = {07-21},
abstract = {Recently, Tsamardinos et al. [2006] presented an algorithm for Bayesian network structure learning that outperforms many state-of-the-art algorithms in terms of efficiency, structure similarity and likelihood. The Max-Min Hill Climbing algorithm is a hybrid of constraint-based and search-and-score techniques, using greedy hill climbing to search a constrained space of possible network structures. The constraints correspond to assertions of conditional independence that must hold in the network from which the data were sampled. One would expect that constraining the space would make search both faster and more accurate, focusing search on the “right” part of the space. The published results indicate, however, that the resulting structures are less accurate when search is constrained. We reproduce these results and explain why they occur. At small samples, the statistical test of conditional independence has low power, which causes the algorithm to exclude edges between dependent variables. Also, the constraints make search relatively harder, leading to errors in edge orientation. In an unconstrained space, search can “repair” these errors by adding in extra edges. We conclude by proposing and evaluating an improved algorithm.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Matthew Rattigan, Marc Maier, David Jensen
Graph clustering with network structure indices Proceedings Article
In: Machine Learning, Proceedings of the Twenty-Fourth International Conference (ICML 2007), Corvallis, Oregon, USA, June 20-24, 2007, pp. 783–790, ACM, 2007.
@inproceedings{DBLP:conf/icml/RattiganMJ07,
title = {Graph clustering with network structure indices},
author = {Matthew Rattigan and Marc Maier and David Jensen},
url = {https://doi.org/10.1145/1273496.1273595},
doi = {10.1145/1273496.1273595},
year = {2007},
date = {2007-01-01},
booktitle = {Machine Learning, Proceedings of the Twenty-Fourth International Conference
(ICML 2007), Corvallis, Oregon, USA, June 20-24, 2007},
volume = {227},
pages = {783--790},
publisher = {ACM},
series = {ACM International Conference Proceeding Series},
abstract = {Graph clustering has become ubiquitous in the study of relational data sets. We examine two simple algorithms: a new graphical adaptation of the k-medoids algorithm and the Girvan-Newman method based on edge betweenness centrality. We show that they can be effective at discovering the latent groups or communities that are defined by the link structure of a graph. However, both approaches rely on prohibitively expensive computations, given the size of modern relational data sets. Network structure indices (NSIs) are a proven technique for indexing network structure and efficiently finding short paths. We show how incorporating NSIs into these graph clustering algorithms can overcome these complexity limitations. We also present promising quantitative and qualitative evaluations of the modified algorithms on synthetic and real data sets.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Trevor Strohman, W. Bruce Croft, David Jensen
Recommending citations for academic papers Proceedings Article
In: SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Amsterdam, The Netherlands, July 23-27, 2007, pp. 705–706, ACM, 2007.
@inproceedings{DBLP:conf/sigir/StrohmanCJ07,
title = {Recommending citations for academic papers},
author = {Trevor Strohman and W. Bruce Croft and David Jensen},
url = {https://doi.org/10.1145/1277741.1277868},
doi = {10.1145/1277741.1277868},
year = {2007},
date = {2007-01-01},
booktitle = {SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval, Amsterdam,
The Netherlands, July 23-27, 2007},
pages = {705--706},
publisher = {ACM},
abstract = {We approach the problem of academic literature search by considering an unpublished manuscript as a query to a search system. We use the text of previous literature as well as the citation graph that connects it to find relevant related material. We evaluate our technique with manual and automatic evaluation methods, and find an order of magnitude improvement in mean average precision as compared to a text similarity baseline.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Lisa Friedland, David Jensen
Finding tribes: identifying close-knit individuals from employment patterns Proceedings Article
In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, California, USA, August 12-15, 2007, pp. 290–299, ACM, 2007.
@inproceedings{DBLP:conf/kdd/FriedlandJ07,
title = {Finding tribes: identifying close-knit individuals from employment
patterns},
author = {Lisa Friedland and David Jensen},
url = {https://doi.org/10.1145/1281192.1281226},
doi = {10.1145/1281192.1281226},
year = {2007},
date = {2007-01-01},
booktitle = {Proceedings of the 13th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, San Jose, California, USA, August
12-15, 2007},
pages = {290--299},
publisher = {ACM},
abstract = {We present a family of algorithms to uncover tribes: groups of individuals who share unusual sequences of affiliations. While much work inferring community structure describes large-scale trends, we instead search for small groups of tightly linked individuals who behave anomalously with respect to those trends. We apply the algorithms to a large temporal and relational data set consisting of millions of employment records from the National Association of Securities Dealers. The resulting tribes contain individuals at higher risk for fraud, are homogenous with respect to risk scores, and are geographically mobile, all at significant levels compared to random or to other sets of individuals who share affiliations.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Matthew Rattigan, Marc Maier, David Jensen, Bin Wu, Xin Pei, Jianbin Tan, Yi Wang
Exploiting Network Structure for Active Inference in Collective Classification Proceedings Article
In: Workshops Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007), October 28-31, 2007, Omaha, Nebraska, USA, pp. 429–434, IEEE Computer Society, 2007.
@inproceedings{DBLP:conf/icdm/RattiganMJWPTW07,
title = {Exploiting Network Structure for Active Inference in Collective Classification},
author = {Matthew Rattigan and Marc Maier and David Jensen and Bin Wu and Xin Pei and Jianbin Tan and Yi Wang},
url = {https://doi.org/10.1109/ICDMW.2007.124},
doi = {10.1109/ICDMW.2007.124},
year = {2007},
date = {2007-01-01},
booktitle = {Workshops Proceedings of the 7th IEEE International Conference on
Data Mining (ICDM 2007), October 28-31, 2007, Omaha, Nebraska, USA},
pages = {429--434},
publisher = {IEEE Computer Society},
abstract = {Active inference seeks to maximize classification performance while minimizing the amount of data that must be labeled ex ante. This task is particularly relevant in the context of relational data, where statistical dependencies among instances can be exploited to improve classification accuracy. We show that efficient methods for indexing network structure can be exploited to select high-value nodes for labeling. This approach substantially outperforms random selection and selection based on simple measures of local structure. We demonstrate the relative effectiveness of this selection approach through experiments with a relational neighbor classifier on a variety of real and synthetic data sets, and identify the necessary characteristics of the data set that allow this approach to perform well.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Andrew Fast, Lisa Friedland, Marc Maier, Brian Taylor, David Jensen, Henry G. Goldberg, John Komoroske
Relational data pre-processing techniques for improved securities fraud detection Proceedings Article
In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, California, USA, August 12-15, 2007, pp. 941–949, ACM, 2007.
@inproceedings{DBLP:conf/kdd/FastFMTJGK07,
title = {Relational data pre-processing techniques for improved securities
fraud detection},
author = {Andrew Fast and Lisa Friedland and Marc Maier and Brian Taylor and David Jensen and Henry G. Goldberg and John Komoroske},
url = {https://doi.org/10.1145/1281192.1281293},
doi = {10.1145/1281192.1281293},
year = {2007},
date = {2007-01-01},
booktitle = {Proceedings of the 13th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, San Jose, California, USA, August
12-15, 2007},
pages = {941--949},
publisher = {ACM},
abstract = {Commercial datasets are often large, relational, and dynamic. They contain many records of people, places, things, events and their interactions over time. Such datasets are rarely structured appropriately for knowledge discovery, and they often contain variables whose meanings change across different subsets of the data. We describe how these challenges were addressed in a collaborative analysis project undertaken by the University of Massachusetts Amherst and the National Association of Securities Dealers (NASD). We describe several methods for data pre-processing that we applied to transform a large, dynamic, and relational dataset describing nearly the entirety of the U.S. securities industry, and we show how these methods made the dataset suitable for learning statistical relational models. To better utilize social structure, we first applied known consolidation and link formation techniques to associate individuals with branch office locations. In addition, we developed an innovative technique to infer professional associations by exploiting dynamic employment histories. Finally, we applied normalization techniques to create a suitable class label that adjusts for spatial, temporal, and other heterogeneity within the data. We show how these pre-processing techniques combine to provide the necessary foundation for learning high-performing statistical models of fraudulent activity.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
David Jensen
Beyond Prediction: Directions for Probabilistic and Relational Learning Proceedings Article
In: Inductive Logic Programming, 17th International Conference, ILP 2007, Corvallis, OR, USA, June 19-21, 2007, Revised Selected Papers, pp. 4–21, Springer, 2007.
@inproceedings{DBLP:conf/ilp/Jensen07,
title = {Beyond Prediction: Directions for Probabilistic and Relational Learning},
author = {David Jensen},
url = {https://doi.org/10.1007/978-3-540-78469-2_2},
doi = {10.1007/978-3-540-78469-2_2},
year = {2007},
date = {2007-01-01},
booktitle = {Inductive Logic Programming, 17th International Conference, ILP
2007, Corvallis, OR, USA, June 19-21, 2007, Revised Selected Papers},
volume = {4894},
pages = {4--21},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
abstract = {Research over the past several decades in learning logical and probabilistic models has greatly increased the range of phenomena that machine learning can address. Recent work has extended these boundaries even further by unifying these two powerful learning frameworks. However, new frontiers await. Current techniques are capable of learning only a subset of the knowledge needed by practitioners in important domains, and further unification of probabilistic and logical learning offers a unique ability to produce the full range of knowledge needed in a wide range of applications.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Jennifer Neville, David Jensen
Bias/Variance Analysis for Relational Domains Proceedings Article
In: Inductive Logic Programming, 17th International Conference, ILP 2007, Corvallis, OR, USA, June 19-21, 2007, Revised Selected Papers, pp. 27–28, Springer, 2007.
@inproceedings{DBLP:conf/ilp/NevilleJ07,
title = {Bias/Variance Analysis for Relational Domains},
author = {Jennifer Neville and David Jensen},
url = {https://doi.org/10.1007/978-3-540-78469-2_6},
doi = {10.1007/978-3-540-78469-2_6},
year = {2007},
date = {2007-01-01},
booktitle = {Inductive Logic Programming, 17th International Conference, ILP
2007, Corvallis, OR, USA, June 19-21, 2007, Revised Selected Papers},
volume = {4894},
pages = {27--28},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
abstract = {Bias/variance analysis is a useful tool for investigating the performance of machine learning algorithms. Conventional analysis decomposes loss into errors due to aspects of the learning process with an underlying assumption that there is no variation in model predictions due to the inference process used for prediction. This assumption is often violated when collective inference models are used for classification of relational data. In relational data, when there are dependencies among the class labels of related instances, the inferences about one object can be used to improve the inferences about other related objects. Collective inference techniques exploit these dependencies by jointly inferring the class labels in a test set. This approach can produce more accurate predictions than conditional inference for each instance independently, but it also introduces an additional source of error, both through the use of approximate inference algorithms and through variation in the availability of test set information. To date, the impact of inference error on relational model performance has not been investigated.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2006
Hendrik Blockeel, David Jensen, Stefan Kramer
Introduction to the special issue on multi-relational data mining and statistical relational learning Journal Article
In: Machine Learning, vol. 62, no. 1-2, pp. 3–5, 2006.
@article{DBLP:journals/ml/BlockeelJK06,
title = {Introduction to the special issue on multi-relational data mining
and statistical relational learning},
author = {Hendrik Blockeel and David Jensen and Stefan Kramer},
url = {https://doi.org/10.1007/s10994-006-5856-7},
doi = {10.1007/s10994-006-5856-7},
year = {2006},
date = {2006-01-01},
journal = {Machine Learning},
volume = {62},
number = {1-2},
pages = {3--5},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Aaron M. Ellison, Leon J. Osterweil, Lori Clarke, Julian L. Hadley, Alexander Wise, Emery Boose, David R. Foster, Allen Hanson, David Jensen, Paul Kuzeja, et al.
Analytic webs support the synthesis of ecological data sets Journal Article
In: Ecology, vol. 87, no. 6, pp. 1345–1358, 2006.
@article{ellison2006analytic,
title = {Analytic webs support the synthesis of ecological data sets},
author = {Aaron M. Ellison and Leon J. Osterweil and Lori Clarke and Julian L. Hadley and Alexander Wise and Emery Boose and David R. Foster and Allen Hanson and David Jensen and Paul Kuzeja and others},
url = {https://esajournals.onlinelibrary.wiley.com/doi/pdfdirect/10.1890/0012-9658%282006%2987%5B1345%3AAWSTSO%5D2.0.CO%3B2},
year = {2006},
date = {2006-01-01},
journal = {Ecology},
volume = {87},
number = {6},
pages = {1345--1358},
publisher = {Wiley Online Library},
abstract = {A wide variety of data sets produced by individual investigators are now synthesized to address ecological questions that span a range of spatial and temporal scales. It is important to facilitate such syntheses so that "consumers" of data sets can be confident that both input data sets and synthetic products are reliable. Necessary documentation to ensure the reliability and validation of data sets includes both familiar descriptive metadata and formal documentation of the scientific processes used (i.e., process metadata) to produce usable data sets from collections of raw data. Such documentation is complex and difficult to construct, so it is important to help "producers" create reliable data sets and to facilitate their creation of required metadata. We describe a formal representation, an "analytic web," that aids both producers and consumers of data sets by providing complete and precise definitions of scientific processes used to process raw and derived data sets. The formalisms used to define analytic webs are adaptations of those used in software engineering, and they provide a novel and effective support system for both the synthesis and the validation of ecological data sets. We illustrate the utility of an analytic web as an aid to producing synthetic data sets through a worked example: the synthesis of long-term measurements of whole-ecosystem carbon exchange. Analytic webs are also useful validation aids for consumers because they support the concurrent construction of a complete, Internet-accessible audit trail of the analytic processes used in the synthesis of the data sets. Finally we describe our early efforts to evaluate these ideas through the use of a prototype software tool, SciWalker. We indicate how this tool has been used to create analytic webs tailored to specific data-set synthesis and validation activities, and suggest extensions to it that will support additional forms of validation. The process metadata created by SciWalker is readily adapted for inclusion in Ecological Metadata Language (EML) files.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
John Burgess, Brian Gallagher, David Jensen, Brian Neil Levine
MaxProp: Routing for Vehicle-Based Disruption-Tolerant Networks Proceedings Article
In: INFOCOM 2006. 25th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, 23-29 April 2006, Barcelona, Catalunya, Spain, IEEE, 2006.
@inproceedings{DBLP:conf/infocom/BurgessGJL06,
title = {MaxProp: Routing for Vehicle-Based Disruption-Tolerant Networks},
author = {John Burgess and Brian Gallagher and David Jensen and Brian Neil Levine},
url = {https://doi.org/10.1109/INFOCOM.2006.228},
doi = {10.1109/INFOCOM.2006.228},
year = {2006},
date = {2006-01-01},
booktitle = {INFOCOM 2006. 25th IEEE International Conference on Computer Communications,
Joint Conference of the IEEE Computer and Communications Societies,
23-29 April 2006, Barcelona, Catalunya, Spain},
publisher = {IEEE},
abstract = {Disruption-tolerant networks (DTNs) attempt to route network messages via intermittently connected nodes. Routing in such environments is difficult because peers have little information about the state of the partitioned network and transfer opportunities between peers are of limited duration. In this paper, we propose MaxProp, a protocol for effective routing of DTN messages. MaxProp is based on prioritizing both the schedule of packets transmitted to other peers and the schedule of packets to be dropped. These priorities are based on the path likelihoods to peers according to historical data and also on several complementary mechanisms, including acknowledgments, a head-start for new packets, and lists of previous intermediaries. Our evaluations show that MaxProp performs better than protocols that have access to an oracle that knows the schedule of meetings between peers. Our evaluations are based on 60 days of traces from a real DTN network we have deployed on 30 buses. Our network, called UMassDieselNet, serves a large geographic area between five colleges. We also evaluate MaxProp on simulated topologies and show it performs well in a wide variety of DTN environments.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Matthew Rattigan, Marc Maier, David Jensen
Using structure indices for efficient approximation of network properties Proceedings Article
In: Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, August 20-23, 2006, pp. 357–366, ACM, 2006.
@inproceedings{DBLP:conf/kdd/RattiganMJ06,
title = {Using structure indices for efficient approximation of network properties},
author = {Matthew Rattigan and Marc Maier and David Jensen},
url = {https://doi.org/10.1145/1150402.1150443},
doi = {10.1145/1150402.1150443},
year = {2006},
date = {2006-01-01},
booktitle = {Proceedings of the Twelfth ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, August
20-23, 2006},
pages = {357--366},
publisher = {ACM},
abstract = {Statistics on networks have become vital to the study of relational data drawn from areas such as bibliometrics, fraud detection, bioinformatics, and the Internet. Calculating many of the most important measures - such as betweenness centrality, closeness centrality, and graph diameter - requires identifying short paths in these networks. However, finding these short paths can be intractable for even moderate-size networks. We introduce the concept of a network structure index (NSI), a composition of (1) a set of annotations on every node in the network and (2) a function that uses the annotations to estimate graph distance between pairs of nodes. We present several varieties of NSIs, examine their time and space complexity, and analyze their performance on synthetic and real data sets. We show that creating an NSI for a given network enables extremely efficient and accurate estimation of a wide variety of network statistics on that network.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
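To make the NSI idea concrete, the sketch below shows one way to "annotate every node and estimate distance from the annotations alone": precompute shortest-path lengths to a few landmark nodes and use the best landmark detour as an upper-bound estimate. This landmark scheme is a stand-in chosen for brevity, not one of the specific NSI variants the paper evaluates, and it assumes a connected graph; the generator in the usage comment is hypothetical.

# Landmark-based distance estimation as a stand-in for an NSI (see note above).
import random
import networkx as nx

def build_annotations(g: nx.Graph, num_landmarks: int = 5, seed: int = 0):
    """Annotate the graph: shortest-path lengths from each sampled landmark."""
    rng = random.Random(seed)
    landmarks = rng.sample(list(g.nodes()), num_landmarks)
    return {lm: nx.single_source_shortest_path_length(g, lm) for lm in landmarks}

def estimate_distance(annotations, u, v) -> int:
    """Upper-bound d(u, v) by routing through the best landmark."""
    return min(dists[u] + dists[v] for dists in annotations.values())

# Usage (hypothetical graph): ann = build_annotations(nx.connected_watts_strogatz_graph(1000, 6, 0.1))
#                             estimate_distance(ann, 0, 500)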
Chirag Shah, W. Bruce Croft, David Jensen
Representing documents with named entities for story link detection (SLD) Proceedings Article
In: Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management, Arlington, Virginia, USA, November 6-11, 2006, pp. 868–869, ACM, 2006.
@inproceedings{DBLP:conf/cikm/ShahCJ06,
title = {Representing documents with named entities for story link detection
(SLD)},
author = {Chirag Shah and W. Bruce Croft and David Jensen},
url = {https://doi.org/10.1145/1183614.1183771},
doi = {10.1145/1183614.1183771},
year = {2006},
date = {2006-01-01},
booktitle = {Proceedings of the 2006 ACM CIKM International Conference on Information
and Knowledge Management, Arlington, Virginia, USA, November 6-11,
2006},
pages = {868--869},
publisher = {ACM},
abstract = {Several information organization, access, and filtering systems can benefit from different kinds of document representations than those used in traditional Information Retrieval (IR). Topic Detection and Tracking (TDT) is an example of such an application. In this paper we demonstrate that named entities serve as better choices of units for document representation over all words. In order to test this hypothesis we study the effect of words-based and entity-based representations on Story Link Detection (SLD) - a core task in TDT research. The experiments on TDT corpora show that entity-based representations give significant improvements for SLD. We also propose a mechanism to expand the set of named entities used for document representation, which enhances the performance in some cases. We then take a step further and analyze the limitations of using only named entities for the document representation. Our studies and experiments indicate that adding additional topical terms can help in addressing such limitations.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Andrew Fast, David Jensen
The NFL Coaching Network: Analysis of the Social Network among Professional Football Coaches Proceedings Article
In: Capturing and Using Patterns for Evidence Detection, Papers from the 2006 AAAI Fall Symposium, Washington, DC, USA, October 13-15, 2006, pp. 112–119, AAAI Press, 2006.
@inproceedings{DBLP:conf/aaaifs/FastJ06,
title = {The NFL Coaching Network: Analysis of the Social Network among Professional
Football Coaches},
author = {Andrew Fast and David Jensen},
url = {https://www.aaai.org/Library/Symposia/Fall/2006/fs06-02-017.php},
year = {2006},
date = {2006-01-01},
booktitle = {Capturing and Using Patterns for Evidence Detection, Papers from the
2006 AAAI Fall Symposium, Washington, DC, USA, October 13-15, 2006},
volume = {FS-06-02},
pages = {112--119},
publisher = {AAAI Press},
series = {AAAI Technical Report},
abstract = {The interactions of professional football coaches and teams in the National Football League (NFL) form a complex social network. This network provides a great opportunity to analyze the influence that coaching mentors have on their proteges. In this paper, we use this social network to identify notable coaches and characterize championship coaches. We also utilize the coaching network to learn a model of which teams will make the playoffs in a given year. Developing comprehensive models of complex adaptive networks, such as the network of NFL coaches, poses a difficult challenge for researchers. From our analysis of the NFL, we identify three types of dependencies that any model of complex network data must be able to represent.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Jennifer Neville, David Jensen
Bias/variance analysis for network data Proceedings Article
In: Proceedings of the Workshop on Statistical Relational Learning, 23rd International Conference on Machine Learning, 2006.
@inproceedings{neville2006bias,
title = {Bias/variance analysis for network data},
author = {Jennifer Neville and David Jensen},
url = {http://www.cs.umd.edu/projects/srl2006/papers/srl06-neville.pdf},
year = {2006},
date = {2006-01-01},
booktitle = {Proceedings of the Workshop on Statistical Relational Learning, 23rd International Conference on Machine Learning},
abstract = {Bias/variance analysis is a useful tool for investigating the performance of machine learning algorithms. Conventional analysis decomposes loss into errors due to aspects of the learning process, but in relational and network applications, the inference process introduces an additional source of error. Collective inference techniques introduce additional error both through the use of approximate inference algorithms and through variation in the availability of test set information. To date, the impact of inference error on model performance has not been investigated. In this paper, we propose a new bias/variance framework that decomposes loss into errors due to both the learning and inference process. We evaluate performance of three relational models on synthetic data and use the framework to understand the reasons for poor model performance. With this understanding, we propose a number of directions to explore to improve model performance.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2005
Matthew Rattigan, David Jensen
The case for anomalous link discovery Journal Article
In: SIGKDD Explorations, vol. 7, no. 2, pp. 41–47, 2005.
@article{DBLP:journals/sigkdd/RattiganJ05,
title = {The case for anomalous link discovery},
author = {Matthew Rattigan and David Jensen},
url = {https://doi.org/10.1145/1117454.1117460},
doi = {10.1145/1117454.1117460},
year = {2005},
date = {2005-01-01},
journal = {SIGKDD Explorations},
volume = {7},
number = {2},
pages = {41--47},
abstract = {In this paper, we describe the challenges inherent to the task of link prediction, and we analyze one reason why many link prediction models perform poorly. Specifically, we demonstrate the effects of the extremely large class skew associated with the link prediction task. We then present an alternate task --- anomalous link discovery (ALD) --- and qualitatively demonstrate the effectiveness of simple link prediction models for the ALD task. We show that even the simplistic structural models that perform poorly on link prediction can perform quite well at the ALD task.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Brian Gallagher, David Jensen, Brian Neil Levine, others
Explaining routing performance in disruption tolerant networks Journal Article
In: University of Massachusetts Amherst, Technical Report, 2005.
@article{gallagher2005explaining,
title = {Explaining routing performance in disruption tolerant networks},
author = {Brian Gallagher and David Jensen and Brian Neil Levine and others},
url = {https://kdl.cs.umass.edu/papers/gallagher-et-al-tr0557.pdf},
year = {2005},
date = {2005-01-01},
journal = {University of Massachusetts Amherst, Technical Report},
abstract = {Many routing algorithms for both traditional and ad hoc networks require a complete and contemporaneous path of peers from source to destination. Disruption Tolerant Networks (DTNs) attempt to deliver messages despite a frequently disconnected link layer (e.g., due to peer mobility, limited communication range, and power management limitations). While several algorithms have been proposed for routing in DTNs, this has not yet led to an understanding of the fundamental issues underlying routing performance in these networks. In this paper we explain the performance of routing algorithms for DTNs in terms of their ability to utilize a set of three no-cost drop criteria. The criteria are necessary and sufficient for identifying messages that may be dropped without degrading the overall delivery rate. The criteria identify whether a route exists with sufficient bandwidth, whether a message has been delivered already, and whether some other peer will deliver the message. We also use the criteria to design a new routing algorithm that we call NoCostDrop, which appears to be the first routing algorithm to take advantage of all three criteria. We show that NoCostDrop outperforms existing algorithms over a wide range of network conditions. Most novel in our approach is the use of a distributed list of delivered messages, which can easily be combined with existing routing algorithms to improve their performance.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
George Dean Bissias, Marc Liberatore, David Jensen, Brian Neil Levine
Privacy Vulnerabilities in Encrypted HTTP Streams Proceedings Article
In: Privacy Enhancing Technologies, 5th International Workshop, PET 2005, Cavtat, Croatia, May 30-June 1, 2005, Revised Selected Papers, pp. 1–11, Springer, 2005.
@inproceedings{DBLP:conf/pet/BissiasLJL05,
title = {Privacy Vulnerabilities in Encrypted HTTP Streams},
author = {George Dean Bissias and Marc Liberatore and David Jensen and Brian Neil Levine},
url = {https://doi.org/10.1007/11767831_1},
doi = {10.1007/11767831_1},
year = {2005},
date = {2005-01-01},
booktitle = {Privacy Enhancing Technologies, 5th International Workshop, PET
2005, Cavtat, Croatia, May 30-June 1, 2005, Revised Selected Papers},
volume = {3856},
pages = {1--11},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
abstract = {Encrypting traffic does not prevent an attacker from performing some types of traffic analysis. We present a straightforward traffic analysis attack against encrypted HTTP streams that is surprisingly effective in identifying the source of the traffic. An attacker starts by creating a profile of the statistical characteristics of web requests from interesting sites, including distributions of packet sizes and inter-arrival times. Later, candidate encrypted streams are compared against these profiles. In our evaluations using real traffic, we find that many web sites are subject to this attack. With a training period of 24 hours and a 1 hour delay afterwards, the attack achieves only 23% accuracy. However, an attacker can easily pre-determine which of the trained sites are easily identifiable. Accordingly, against 25 such sites, the attack achieves 40% accuracy; with three guesses, the attack achieves 100% accuracy for our data. Longer delays after training decrease accuracy, but not substantially. We also propose some countermeasures and improvements to our current method. Previous work analyzed SSL traffic to a proxy, taking advantage of a known flaw in SSL that reveals the length of each web object. In contrast, we exploit the statistical characteristics of web streams that are encrypted as a single flow, which is the case with WEP/WPA, IPsec, and SSH tunnels.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Jennifer Neville, David Jensen
Leveraging Relational Autocorrelation with Latent Group Models Proceedings Article
In: Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), 27-30 November 2005, Houston, Texas, USA, pp. 322–329, IEEE Computer Society, 2005.
@inproceedings{DBLP:conf/icdm/NevilleJ05,
title = {Leveraging Relational Autocorrelation with Latent Group Models},
author = {Jennifer Neville and David Jensen},
url = {https://doi.org/10.1109/ICDM.2005.89},
doi = {10.1109/ICDM.2005.89},
year = {2005},
date = {2005-01-01},
booktitle = {Proceedings of the 5th IEEE International Conference on Data Mining
(ICDM 2005), 27-30 November 2005, Houston, Texas, USA},
pages = {322--329},
publisher = {IEEE Computer Society},
abstract = {The presence of autocorrelation provides a strong motivation for using relational learning and inference techniques. Autocorrelation is a statistical dependence between the values of the same variable on related entities and is a nearly ubiquitous characteristic of relational data sets. Recent research has explored the use of collective inference techniques to exploit this phenomenon. These techniques achieve significant performance gains by modeling observed correlations among class labels of related instances, but the models fail to capture a frequent cause of autocorrelation - the presence of underlying groups that influence the attributes on a set of entities. We propose a latent group model (LGM) for relational data, which discovers and exploits the hidden structures responsible for the observed autocorrelation among class labels. Modeling the latent group structure improves model performance, increases inference efficiency, and enhances our understanding of the datasets. We evaluate performance on three relational classification tasks and show that LGM outperforms models that ignore latent group structure, particularly when there is little information with which to seed inference.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Andrew Fast, David Jensen, Brian Neil Levine
Creating social networks to improve peer-to-peer networking Proceedings Article
In: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, Illinois, USA, August 21-24, 2005, pp. 568–573, ACM, 2005.
@inproceedings{DBLP:conf/kdd/FastJL05,
title = {Creating social networks to improve peer-to-peer networking},
author = {Andrew Fast and David Jensen and Brian Neil Levine},
url = {https://doi.org/10.1145/1081870.1081938},
doi = {10.1145/1081870.1081938},
year = {2005},
date = {2005-01-01},
booktitle = {Proceedings of the Eleventh ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, Chicago, Illinois, USA, August
21-24, 2005},
pages = {568--573},
publisher = {ACM},
abstract = {We use knowledge discovery techniques to guide the creation of efficient overlay networks for peer-to-peer file sharing. An overlay network specifies the logical connections among peers in a network and is distinct from the physical connections of the network. It determines the order in which peers will be queried when a user is searching for a specific file. To better understand the role of the network overlay structure in the performance of peer-to-peer file sharing protocols, we compare several methods for creating overlay networks. We analyze the networks using data from a campus network for peer-to-peer file sharing that recorded anonymized data on 6,528 users sharing 291,925 music files over an 81-day period. We propose a novel protocol for overlay creation based on a model of user preference identified by latent-variable clustering with hierarchical Dirichlet processes (HDPs). Our simulations and empirical studies show that the clusters of songs created by HDPs effectively model user behavior and can be used to create desirable network overlays that outperform alternative approaches.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Stephen Hart, Roderic A. Grupen, David Jensen
A Relational Representation for Procedural Task Knowledge Proceedings Article
In: Proceedings, The Twentieth National Conference on Artificial Intelligence and the Seventeenth Innovative Applications of Artificial Intelligence Conference, July 9-13, 2005, Pittsburgh, Pennsylvania, USA, pp. 1280–1285, AAAI Press / The MIT Press, 2005.
@inproceedings{DBLP:conf/aaai/HartGJ05,
title = {A Relational Representation for Procedural Task Knowledge},
author = {Stephen Hart and Roderic A. Grupen and David Jensen},
url = {http://www.aaai.org/Library/AAAI/2005/aaai05-203.php},
year = {2005},
date = {2005-01-01},
booktitle = {Proceedings, The Twentieth National Conference on Artificial Intelligence
and the Seventeenth Innovative Applications of Artificial Intelligence
Conference, July 9-13, 2005, Pittsburgh, Pennsylvania, USA},
pages = {1280--1285},
publisher = {AAAI Press / The MIT Press},
abstract = {This paper proposes a methodology for learning joint probability estimates regarding the effect of sensorimotor features on the predicted quality of desired behavior. These relationships can then be used to choose actions that will most likely produce success. Relational dependency networks are used to learn statistical models of procedural task knowledge. An example task expert for picking up objects is learned through actual experience with a humanoid robot. We believe that this approach is widely applicable and has great potential to allow a robot to autonomously determine which features in the world are salient and should be used to recommend policy for action.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Matthew Rattigan, David Jensen
The case for anomalous link detection Proceedings Article
In: Proceedings of the 4th international workshop on multi-relational mining, pp. 69–74, 2005.
@inproceedings{rattigan2005case,
title = {The case for anomalous link detection},
author = {Matthew Rattigan and David Jensen},
url = {https://dl.acm.org/doi/pdf/10.1145/1090193.1090205},
year = {2005},
date = {2005-01-01},
booktitle = {Proceedings of the 4th international workshop on multi-relational mining},
pages = {69--74},
abstract = {In this paper, we describe the challenges inherent to the Link Prediction (LP) problem in multirelational data mining, and explore the reasons why many LP models have performed poorly. We present the alternate (and complementary) task of Anomalous Link Discovery (ALD) and qualitatively demonstrate the effectiveness of simple LP models for the ALD task.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Ozgur Simsek, David Jensen
A probabilistic framework for decentralized search in networks Proceedings Article
In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2005.
@inproceedings{simsek2005probabilistic,
title = {A probabilistic framework for decentralized search in networks},
author = {Ozgur Simsek and David Jensen},
year = {2005},
date = {2005-01-01},
booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Ozgur Simsek, David Jensen
Decentralized Search in Networks Using Homophily and Degree Disparity Proceedings Article
In: IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30 - August 5, 2005, pp. 304–310, Professional Book Center, 2005.
@inproceedings{SimsekJ05,
title = {Decentralized Search in Networks Using Homophily and Degree Disparity},
author = {Ozgur Simsek and David Jensen},
url = {http://ijcai.org/Proceedings/05/Papers/1509.pdf},
year = {2005},
date = {2005-01-01},
booktitle = {IJCAI-05, Proceedings of the Nineteenth International Joint Conference
on Artificial Intelligence, Edinburgh, Scotland, UK, July 30 - August
5, 2005},
pages = {304--310},
publisher = {Professional Book Center},
abstract = {We propose a new algorithm for finding a target node in a network whose topology is known only locally. We formulate this task as a problem of decision making under uncertainty and use the statistical properties of the graph to guide this decision. This formulation uses the homophily and degree structure of the network simultaneously, differentiating our algorithm from those previously proposed in the literature. Because homophily and degree disparity are characteristics frequently observed in real-world networks, the algorithm we propose is applicable to a wide variety of networks, including two families that have received much recent attention: small-world and scale-free networks.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2004
David Jensen, Jennifer Neville, Brian Gallagher
Why collective inference improves relational classification Proceedings Article
In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, Washington, USA, August 22-25, 2004, pp. 593–598, ACM, 2004.
@inproceedings{DBLP:conf/kdd/JensenNG04,
title = {Why collective inference improves relational classification},
author = {David Jensen and Jennifer Neville and Brian Gallagher},
url = {https://doi.org/10.1145/1014052.1014125},
doi = {10.1145/1014052.1014125},
year = {2004},
date = {2004-01-01},
booktitle = {Proceedings of the Tenth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, Seattle, Washington, USA, August
22-25, 2004},
pages = {593--598},
publisher = {ACM},
abstract = {Procedures for collective inference make simultaneous statistical judgments about the same variables for a set of related data instances. For example, collective inference could be used to simultaneously classify a set of hyperlinked documents or infer the legitimacy of a set of related financial transactions. Several recent studies indicate that collective inference can significantly reduce classification error when compared with traditional inference techniques. We investigate the underlying mechanisms for this error reduction by reviewing past work on collective inference and characterizing different types of statistical models used for making inference in relational data. We show important differences among these models, and we characterize the necessary and sufficient conditions for reduced classification error based on experiments with real and simulated data.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Jennifer Neville, David Jensen
Dependency Networks for Relational Data Proceedings Article
In: Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 1-4 November 2004, Brighton, UK, pp. 170–177, IEEE Computer Society, 2004.
@inproceedings{DBLP:conf/icdm/NevilleJ04,
title = {Dependency Networks for Relational Data},
author = {Jennifer Neville and David Jensen},
url = {https://doi.org/10.1109/ICDM.2004.10101},
doi = {10.1109/ICDM.2004.10101},
year = {2004},
date = {2004-01-01},
booktitle = {Proceedings of the 4th IEEE International Conference on Data Mining
(ICDM 2004), 1-4 November 2004, Brighton, UK},
pages = {170--177},
publisher = {IEEE Computer Society},
abstract = {Instance independence is a critical assumption of traditional machine learning methods contradicted by many relational datasets. For example, in scientific literature datasets, there are dependencies among the references of a paper. Recent work on graphical models for relational data has demonstrated significant performance gains for models that exploit the dependencies among instances. In this paper, we present relational dependency networks (RDNs), a new form of graphical model capable of reasoning with such dependencies in a relational setting. We describe the details of RDN models and outline their strengths, most notably the ability to learn and reason with cyclic relational dependencies. We present RDN models learned on a number of real-world datasets, and evaluate the models in a classification context, showing significant performance improvements. In addition, we use synthetic data to evaluate the quality of model learning and inference procedures.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Jennifer Neville, Micah Adler, David Jensen
Spectral clustering with links and attributes Technical Report
University of Massachusetts Amherst, Department of Computer Science, 2004.
@techreport{neville2004spectral,
title = {Spectral clustering with links and attributes},
author = {Jennifer Neville and Micah Adler and David Jensen},
url = {https://www.cs.purdue.edu/homes/neville/papers/neville-et-al-tr0442.pdf},
year = {2004},
date = {2004-01-01},
institution = {University of Massachusetts Amherst, Department of Computer Science},
abstract = {If relational data contain communities—groups of inter-related items with similar attribute values—a clustering technique that considers attribute information and the structure of relations simultaneously should produce more meaningful clusters than those produced by considering attributes alone. We investigate this hypothesis in the context of a spectral graph partitioning technique, considering a number of hybrid similarity metrics that combine both sources of information. Through simulation, we find that two of the hybrid metrics achieve superior performance over a wide range of data characteristics. We analyze the spectral decomposition algorithm from a statistical perspective and show that the successful hybrid metrics exaggerate the separation between cluster similarity values, at the expense of increased variance. We cluster several relational datasets using the best hybrid metric and show that the resulting clusters exhibit significant community structure, and that they significantly improve performance in a related classification task.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Jennifer Neville, Ozgur Simsek, David Jensen
Autocorrelation and relational learning: Challenges and opportunities Technical Report
University of Massachusetts Amherst, Department of Computer Science, 2004.
@techreport{neville2004autocorrelation,
title = {Autocorrelation and relational learning: Challenges and opportunities},
author = {Jennifer Neville and Ozgur Simsek and David Jensen},
url = {http://www.cs.umd.edu/projects/srl2004/Papers/neville.pdf},
year = {2004},
date = {2004-01-01},
institution = {University of Massachusetts Amherst, Department of Computer Science},
abstract = {Autocorrelation, a common characteristic of many datasets, refers to correlation between values of the same variable on related objects. It violates the critical assumption of instance independence that underlies most conventional models. In this paper, we provide an overview of research on autocorrelation in a number of fields with an emphasis on implications for relational learning, and outline a number of challenges and opportunities for model learning and inference.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
2003
Andrew McCallum, David Jensen
A note on the unification of information extraction and data mining using conditional-probability, relational models Journal Article
In: Computer Science Department Faculty Publication Series, pp. 42, 2003.
@article{mccallum2003note,
title = {A note on the unification of information extraction and data mining using conditional-probability, relational models},
author = {Andrew McCallum and David Jensen},
url = {http://ciir.cs.umass.edu/pubfiles/ir-306.pdf},
year = {2003},
date = {2003-01-01},
journal = {Computer Science Department Faculty Publication Series},
pages = {42},
abstract = {Although information extraction and data mining appear together in many applications, their interface in most current systems would better be described as serial juxtaposition than as tight integration. Information extraction populates slots in a database by identifying relevant subsequences of text, but is usually not aware of the emerging patterns and regularities in the database. Data mining methods begin from a populated database, and are often unaware of where the data came from, or its inherent uncertainties. The result is that the accuracy of both suffers, and significant mining of complex text sources is beyond reach. This position paper proposes the use of unified, relational, undirected graphical models for information extraction and data mining, in which extraction decisions and data-mining decisions are made in the same probabilistic “currency,” with a common inference procedure—each component thus being able to make up for the weaknesses of the other and therefore improving the performance of both. For example, data mining run on a partially-filled database can find patterns that provide “top-down” accuracy-improving constraints to information extraction. Information extraction can provide a much richer set of “bottom-up” hypotheses to data mining if the mining is set up to handle additional uncertainty information from extraction. We outline an approach and describe several models, but provide no experimental results.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Amy McGovern, Lisa Friedland, Michael Hay, Brian Gallagher, Andrew Fast, Jennifer Neville, David Jensen
Exploiting relational structure to understand publication patterns in high-energy physics Journal Article
In: SIGKDD Explor., vol. 5, no. 2, pp. 165–172, 2003.
@article{DBLP:journals/sigkdd/McGovernFHGFNJ03,
title = {Exploiting relational structure to understand publication patterns
in high-energy physics},
author = {Amy McGovern and Lisa Friedland and Michael Hay and Brian Gallagher and Andrew Fast and Jennifer Neville and David Jensen},
url = {https://doi.org/10.1145/980972.980999},
doi = {10.1145/980972.980999},
year = {2003},
date = {2003-01-01},
journal = {SIGKDD Explor.},
volume = {5},
number = {2},
pages = {165--172},
abstract = {We analyze publication patterns in theoretical high-energy physics using a relational learning approach. We focus on four related areas: understanding and identifying patterns of citations, examining publication patterns at the author level, predicting whether a paper will be accepted by specific journals, and identifying research communities from the citation patterns and paper text. Each of these analyses contributes to an overall understanding of theoretical high-energy physics.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
David Jensen, Jennifer Neville
Data mining in social networks Book
na, 2003.
@book{jensen2003data,
title = {Data mining in social networks},
author = {David Jensen and Jennifer Neville},
url = {https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1069&context=cs_faculty_pubs},
year = {2003},
date = {2003-01-01},
publisher = {na},
abstract = {Several techniques for learning statistical models have been developed recently by researchers in machine learning and data mining. All of these techniques must address a similar set of representational and algorithmic choices and must face a set of statistical challenges unique to learning from relational data.},
keywords = {},
pubstate = {published},
tppubtype = {book}
}
Jennifer Neville, David Jensen, Lisa Friedland, Michael Hay
Learning relational probability trees Proceedings Article
In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 24 - 27, 2003, pp. 625–630, ACM, 2003.
@inproceedings{DBLP:conf/kdd/NevilleJFH03,
title = {Learning relational probability trees},
author = {Jennifer Neville and David Jensen and Lisa Friedland and Michael Hay},
url = {https://doi.org/10.1145/956750.956830},
doi = {10.1145/956750.956830},
year = {2003},
date = {2003-01-01},
booktitle = {Proceedings of the Ninth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, Washington, DC, USA, August 24
- 27, 2003},
pages = {625--630},
publisher = {ACM},
abstract = {Classification trees are widely used in the machine learning and data mining communities for modeling propositional data. Recent work has extended this basic paradigm to probability estimation trees. Traditional tree learning algorithms assume that instances in the training data are homogenous and independently distributed. Relational probability trees (RPTs) extend standard probability estimation trees to a relational setting in which data instances are heterogeneous and interdependent. Our algorithm for learning the structure and parameters of an RPT searches over a space of relational features that use aggregation functions (e.g. AVERAGE, MODE, COUNT) to dynamically propositionalize relational data and create binary splits within the RPT. Previous work has identified a number of statistical biases due to characteristics of relational data such as autocorrelation and degree disparity. The RPT algorithm uses a novel form of randomization test to adjust for these biases. On a variety of relational learning tasks, RPTs built using randomization tests are significantly smaller than other models and achieve equivalent, or better, performance.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Jennifer Neville, David Jensen
Collective classification with relational dependency networks Proceedings Article
In: Workshop on Multi-Relational Data Mining (MRDM-2003), pp. 77, 2003.
@inproceedings{neville2003collective,
title = {Collective classification with relational dependency networks},
author = {Jennifer Neville and David Jensen},
url = {https://www.cs.purdue.edu/homes/neville/papers/neville-jensen-mrdm2003.pdf},
year = {2003},
date = {2003-01-01},
booktitle = {Workshop on Multi-Relational Data Mining (MRDM-2003)},
pages = {77},
abstract = {Collective classification models exploit the dependencies in a network of objects to improve predictions. For example, in a network of web pages, the topic of a page may depend on the topics of hyperlinked pages. A relational model capable of expressing and reasoning with such dependencies should achieve superior performance to relational models that ignore such dependencies. In this paper, we present relational dependency networks (RDNs), extending recent work in dependency networks to a relational setting. RDNs are a collective classification model that offers simple parameter estimation and efficient structure learning. On two real-world data sets, we compare RDNs to ordinary classification with relational probability trees and show that collective classification improves performance.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}