  1. No Access

    Chapter and Conference Paper

    The Integrated Delivery of Large-Scale Data Mining: The ACSys Data Mining Project

    Data Mining draws on many technologies to deliver novel and actionable discoveries from very large collections of data. The Australian Government’s Cooperative Research Centre for Advanced Computational System...

    Graham Williams, Irfan Altas, Sergey Bakin in Large-Scale Parallel Data Mining (2000)

  2. No Access

    Chapter and Conference Paper

    Estimating Episodes of Care Using Linked Medical Claims Data

    Australia has extensive administrative health data collected by Commonwealth and state agencies. Using a unique cleaned and linked administrative health dataset we address the problem of empirically defining e...

    Graham Williams, Rohan Baxter, Chris Kelman in AI 2002: Advances in Artificial Intelligen… (2002)

  3. No Access

    Chapter and Conference Paper

    Association Rule Discovery with Unbalanced Class Distributions

    There are many methods for finding association rules in very large data. However it is well known that most general association rule discovery methods find too many rules, many of which are uninteresting rules...

    Lifang Gu, Jiuyong Li, Hongxing He in AI 2003: Advances in Artificial Intelligen… (2003)

  4. No Access

    Chapter and Conference Paper

    Conceptual Mining of Large Administrative Health Data

    Health databases are characterised by large number of records, large number of attributes and mild density. This encourages data miners to use methodologies that are more sensitive to health industry specifics...

    Tatiana Semenova, Markus Hegland in Advances in Knowledge Discovery and Data M… (2004)

  5. No Access

    Chapter and Conference Paper

    Exploring Possible Adverse Drug Reactions by Clustering Event Sequences

    Historically the identification of adverse drug reactions relies on manual processes whereby doctors and hospitals report incidences to a central agency. In this paper we suggest a data mining approach using a...

    Hongxing He, Graham Williams, Jie Chen in Data Warehousing and Knowledge Discovery (2004)

  6. No Access

    Chapter and Conference Paper

    Temporal Sequence Associations for Rare Events

    In many real world applications, systematic analysis of rare events, such as credit card frauds and adverse drug reactions, is very important. Their low occurrence rate in large databases often makes it diffic...

    Jie Chen, Hongxing He, Graham Williams in Advances in Knowledge Discovery and Data M… (2004)

  7. No Access

    Article

    On-Line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms

    Outlier detection is a fundamental issue in data mining, specifically in fraud detection, network intrusion detection, network monitoring, etc. SmartSifter is an outlier detection engine addressing this proble...

    Kenji Yamanishi, Jun-ichi Takeuchi, Graham Williams in Data Mining and Knowledge Discovery (2004)

  8. No Access

    Chapter and Conference Paper

    Representing Association Classification Rules Mined from Health Data

    An association classification algorithm has been developed to explore adverse drug reactions in a large medical transaction dataset with unbalanced classes. Rules discovered can be used to alert medical practi...

    Jie Chen, Hongxing He, Jiuyong Li in Knowledge-Based Intelligent Information an… (2005)

  9. No Access

    Chapter and Conference Paper

    Neighborhood Density Method for Selecting Initial Cluster Centers in K-Means Clustering

    This paper presents a new method for effectively selecting initial cluster centers in k-means clustering. This method identifies the high density neighborhoods from the data first and then selects the central poi...

    Yunming Ye, Joshua Zhexue Huang in Advances in Knowledge Discovery and Data M… (2006)

  10. No Access

    Chapter

    Identifying Risk Groups Associated with Colorectal Cancer

    In this paper, we explore data mining techniques for the task of identifying and describing risk groups for colorectal cancer (CRC) from population based administrative health data. Association rule discovery,...

    Jie Chen, Hongxing He, Huidong Jin, Damien McAullay, Graham Williams in Data Mining (2006)

  11. No Access

    Chapter and Conference Paper

    A Survey of Open Source Data Mining Systems

    Open source data mining software represents a new trend in data mining research, education and industrial applications, especially in small and medium enterprises (SMEs). With open source software an enterpris...

    Xiaojun Chen, Yunming Ye, Graham Williams in Emerging Technologies in Knowledge Discove… (2007)

  12. No Access

    Book and Conference Proceedings

    New Frontiers in Applied Data Mining

    PAKDD 2009 International Workshops, Bangkok, Thailand, April 27-30, 2009. Revised Selected Papers

    Thanaruk Theeramunkong in Lecture Notes in Computer Science (2010)

  13. No Access

    Chapter and Conference Paper

    Hybrid Random Forests: Advantages of Mixed Trees in Classifying Text Data

    Random forests are a popular classification method based on an ensemble of a single type of decision tree. In the literature, there are many different types of decision tree algorithms, including C4.5, CART an...

    Baoxun Xu, Joshua Zhexue Huang in Advances in Knowledge Discovery and Data M… (2012)

  14. No Access

    Chapter and Conference Paper

    Ensemble Clustering of High Dimensional Data with FastMap Projection

    In this paper, we propose an ensemble clustering method for high dimensional data which uses FastMap projection to generate subspace component data sets. In comparison with popular random sampling and random p...

    Imran Khan, Joshua Zhexue Huang in Trends and Applications in Knowledge Disco… (2014)

  15. No Access

    Chapter and Conference Paper

    Extensions to Quantile Regression Forests for Very High-Dimensional Data

    This paper describes new extensions to the state-of-the-art regression random forests Quantile Regression Forests (QRF) for applications to high-dimensional data with thousands of features. We propose a new subsp...

    Nguyen Thanh Tung, Joshua Zhexue Huang in Advances in Knowledge Discovery and Data M… (2014)

  16. No Access

    Chapter and Conference Paper

    Stratified Over-Sampling Bagging Method for Random Forests on Imbalanced Data

    Imbalanced data presents a big challenge to random forests (RF). Over-sampling is a commonly used sampling method for imbalanced data, which increases the number of instances of minority class to balance the c...

    He Zhao, Xiaojun Chen, Tung Nguyen in Intelligence and Security Informatics (2016)

  17. No Access

    Chapter and Conference Paper

    TRIC: A Triples Corrupter for Knowledge Graphs

    We study the problem of corrupting triples in Knowledge Graphs (KG) for the purpose of assisting anomaly detection and error detection techniques developed for KG quality enhancement. Our goal is to provide us...

    Asara Senaratne, Pouya Ghiasnezhad Omran in The Semantic Web: ESWC 2023 Satellite Even… (2023)