
Scaling Up Machine Learning 

Parallel and Distributed Approaches 

Ron Bekkerman, LinkedIn

Misha Bilenko, Microsoft Research

John Langford, Yahoo! Research (now Microsoft Research)

Presented by Tom 

http://hunch.net/~large_scale_survey 

1


Outline 

  • Big DATA
  • Crowdsourcing Labeled Data
  • Parallelization: Platform Choices
    • Example on k-means clustering
 

2


The book 

  • Cambridge Uni Press
  • Due in November 2011
  • 21 chapters
  • Covering
    • Platforms
    • Algorithms
    • Learning setups
    • Applications
 

3


Chapter contributors

[Contributors of all 21 chapters]

4


Previous books 

[Covers of three earlier related books, from 1998, 2000, and 2000]

5


Data hypergrowth: an example 

  • Reuters-21578: about 10K docs (ModApte) [Bekkerman et al, SIGIR 2001]
  • RCV1: about 807K docs [Bekkerman & Scholz, CIKM 2008]
  • LinkedIn job title data: about 100M docs [Bekkerman & Gavish, KDD 2011]

6


New age of big data 

  • The world has gone mobile
    • 5 billion cellphones produce daily data
  • Social networks have gone online
    • Twitter produces 200M tweets a day
  • Crowdsourcing is the reality
    • Labeling of 100,000+ data instances is doable
      • Within a week 
 

7


Size matters 

  • One thousand data instances
  • One million data instances
  • One billion data instances
  • One trillion data instances
 

Those are not different numbers; those are different mindsets

8


One thousand data instances 

  • Can be processed manually within a day (a week?)
    • No need for an automatic approach
  • We shouldn't publish main results on datasets of such size
 

9


One million data instances 

  • Currently, the most active zone
  • Can be crowdsourced
  • Can be processed by a quadratic algorithm
    • Once parallelized
  • A 1M-instance data collection cannot be too diverse
    • But it can be too homogeneous
  • Preprocessing / data probing is crucial
 
 

10


Big dataset cannot be too sparse 

  • 1M data instances cannot belong to 1M classes
    • Simply because it’s not practical to have 1M classes 
  • Here's a statistical experiment in the text domain (a scaled-down simulation sketch follows below):
    • 1M documents
    • Each document is 100 words long
    • Randomly sampled from a unigram language model
      • No stopwords
    • 245M pairs have word overlap of 10% or more
  • Real-world datasets are denser than random
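To get a feel for this experiment, here is a scaled-down Monte Carlo sketch in Python. The vocabulary size, the Zipf-like word distribution, and all counts below are illustrative assumptions (the slide does not specify them), so the estimated fraction is only meant to show the order of magnitude, not to reproduce the 245M figure.

import numpy as np

rng = np.random.default_rng(0)
V, doc_len, n_docs, n_pairs = 50_000, 100, 2_000, 100_000   # assumed, scaled-down parameters

# Zipf-like unigram language model over a fixed vocabulary
probs = 1.0 / np.arange(1, V + 1)
probs /= probs.sum()
docs = [set(rng.choice(V, size=doc_len, p=probs)) for _ in range(n_docs)]

# estimate the probability that two random documents share >= 10% of their words
hits = 0
for _ in range(n_pairs):
    i, j = rng.integers(n_docs, size=2)
    if i != j and len(docs[i] & docs[j]) >= doc_len // 10:
        hits += 1
print(f"estimated fraction of overlapping pairs: {hits / n_pairs:.4f}")
# multiplying this fraction by the ~5e11 document pairs in a 1M-doc collection
# gives an order-of-magnitude count of overlapping pairs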
 

11


Can big datasets be too dense? 

[Ten example .jpg images]

12


One billion data instances 

  • Web-scale
  • Guaranteed to contain data in different formats
    • ASCII text, pictures, JavaScript code, PDF documents…
  • Guaranteed to contain (near) duplicates
  • Likely to be badly preprocessed 
  • Storage is an issue
 

13


One trillion data instances 

  • Beyond the reach of modern technology
  • The peer-to-peer paradigm is (arguably) the only way to process the data
  • Data privacy / inconsistency / skewness issues
    • Can’t be kept in one location
    • Is intrinsically hard to sample
 

14


A solution to the data privacy problem

  • n machines with n private datasets
    • All datasets intersect
    • The intersection is shared
  • Each machine learns a separate model
  • Models get consistent over the data intersection
 

Xiang et al, Chapter 16 

[Diagram: private datasets D1-D6 with a shared intersection]

  • Check out Chapter 16 to see this approach applied in a recommender system!
 

15


So what model will we learn? 

  • Supervised model?
  • Unsupervised model?
  • Semi-supervised model?
  • Obviously, depending on the application  
    • But also on availability of labeled data
    • And its trustworthiness!
 

16


Size of training data 

  • Say you have 1K labeled and 1M unlabeled examples
    • Labeled/unlabeled ratio: 0.1%
    • Is 1K enough to train a supervised model?
  • Now you have 1M labeled and 1B unlabeled examples
    • Labeled/unlabeled ratio: 0.1%
    • Is 1M enough to train a supervised model?
 

17


Skewness of training data  

  • Usually, training data comes from users
  • Explicit user feedback might be misleading
    • Feedback providers may have various incentives
  • Learning from implicit feedback is a better idea
    • E.g. clicking on Web search results
  • In large-scale setups, skewness of training data is hard to detect 
 

18


Real-world example 

  • Goal: find high-quality professionals on LinkedIn
  • Idea: use recommendation data to train a model
    • Whoever has recommendations is a positive example
    • Is it a good idea?  
       
       
       
       
       
 

19


Not enough (clean) training data? 

  • Use existing labels as guidance rather than as a directive
    • In a semi-supervised clustering framework
  • Or label more data! 
    • With a little help from the crowd
 

20


Semi-supervised clustering 

  • Cluster unlabeled data D while taking labeled data D* into account
  • Construct a clustering $\tilde{D}$ of D while maximizing the mutual information $I(\tilde{D}; \tilde{D}^*)$
    • And keeping the number of clusters k constant
    • $\tilde{D}^*$ is defined naturally over the classes in D*
  • Results better than those of classification
 

Bekkerman et al, ECML 2006 

21


Semi-supervised clustering (details) 
 
 

  • Define an empirical joint distribution P(D, D*)
    • P(d, d*) is a normalized similarity between d and d*
  • Define the joint distribution between clusterings:
    • $\tilde{P}(\tilde{d}, \tilde{d}^*) = \sum_{d \in \tilde{d}} \sum_{d^* \in \tilde{d}^*} P(d, d^*)$
    • where $\tilde{P}(\tilde{d})$ and $\tilde{P}(\tilde{d}^*)$ are its marginals
 

22


Crowdsourcing labeled data 

  • Crowdsourcing is a tough business 
    • People are not machines
  • Any worker who can game the system will game the system
  • Validation framework + qualification tests are a must
  • Labeling a lot of data can be fairly expensive
 

23


How to label 1M instances 

  • Budget a month of work + about $50,000
 

24


How to label 1M instances 

  • Hire a data annotation contractor in your town
    • Presumably someone you know well enough
 

25


How to label 1M instances 

  • Offer the contractor $10,000 for one month of work
 

26


How to label 1M instances 

  • Construct a qualification test for your job
  • Hire 100 workers who pass it
    • Keep their worker IDs
 

27


How to label 1M instances 

  • Explain the task to them and make sure they get it
 

28


How to label 1M instances 

  • Offer them 4¢ per data instance if they do it right
 

29


How to label 1M instances 

  • Your contractor will label 500 data instances a day
  • This data will be used to validate worker results
 


30


How to label 1M instances 

  • You’ll need to spot-check the results
 

31




How to label 1M instances 

  • Each worker gets a daily task of 1000 data instances
 
 
 


33


How to label 1M instances 

  • Some of which are already labeled by the contractor
 
 


34


How to label 1M instances 

  • Check every worker’s result on that validation set
 
 

35




How to label 1M instances 

  • Fire the worst 50 workers
    • Disregard their results
 

37


How to label 1M instances 

  • Hire 50 new ones
 

38


How to label 1M instances 

  • Repeat for a month (20 working days)
    • 50 workers × 20 days × 1,000 data points a day × 4¢ = 1M labeled instances for $40,000 (plus the contractor's $10,000, about $50,000 total)
 

39


Got 1M labeled instances, now what? 

  • Now go train your model 
  • Rule of thumb: heavier algorithms produce better results
  • Rule of the other thumb: forget about super-quadratic algorithms
  • Parallelization looks unavoidable 
 

40


Parallelization: platform choices 

Platform            Communication Scheme    Data size
Peer-to-Peer        TCP/IP                  Petabytes
Virtual Clusters    MapReduce / MPI         Terabytes
HPC Clusters        MPI / MapReduce         Terabytes
Multicore           Multithreading          Gigabytes
GPU                 CUDA                    Gigabytes
FPGA                HDL                     Gigabytes

41


Example: k-means clustering 

  • An EM-like algorithm (a minimal sequential sketch follows below):
  • Initialize k cluster centroids
  • E-step: associate each data instance with the closest centroid
    • Find expected values of cluster assignments given the data and centroids
  • M-step: recalculate centroids as an average of the associated data instances
    • Find new centroids that maximize that expectation
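As a reference point, here is a minimal single-machine version of these two steps in Python/NumPy; the initialization choice, iteration count, and names are illustrative, not taken from the book.

import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    # initialize k cluster centroids with k randomly chosen data instances
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # E-step: associate each data instance with its closest centroid
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # M-step: recalculate each centroid as the average of its associated instances
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, labels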
 

42


Parallelizing k-means

43-45


Parallelization: platform choices 

Next up: Peer-to-Peer (communication: TCP/IP; data size: petabytes)

46


Peer-to-peer (P2P) systems 

  • Millions of machines connected in a network
    • Each machine can only contact its neighbors
  • Each machine storing millions of data instances
    • Practically unlimited scale 
  • Communication is the bottleneck
    • Aggregation is costly, broadcast is cheaper
  • Messages are sent over a spanning tree
    • With an arbitrary node being the root
 

47


k-means in P2P 

  • Uniformly sample k centroids over P2P
    • Using a random walk method
  • Broadcast the centroids
  • Run local k-means on each machine
  • Sample n nodes
  • Aggregate local centroids of those n nodes
 

Datta et al, TKDE 2009 

48


Parallelization: platform choices 

Next up: Virtual Clusters (communication: MapReduce / MPI; data size: terabytes)

49


Virtual clusters 

  • Datacenter-scale clusters
    • Hundreds of thousands of machines
  • Distributed file system
    • Data redundancy
  • Cloud computing paradigm
    • Virtualization, full fault tolerance, pay-as-you-go
  • MapReduce is the #1 data processing scheme
 
 

50


MapReduce 


  • Process in parallel → shuffle → process in parallel
  • Mappers output (key, value) records
    • Records with the same key are sent to the same reducer
 

51


k-means on MapReduce 

  • Mappers read data portions and centroids
  • Mappers assign data instances to clusters
  • Mappers compute new local centroids and local cluster sizes
  • Reducers aggregate local centroids (weighted by local cluster sizes) into new global centroids
  • Reducers write the new centroids (a minimal mapper/reducer sketch follows below)
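A bare-bones, framework-agnostic Python rendition of that mapper/reducer pair; the function names and in-memory record handling are illustrative and are not the Chapter 2 implementation.

from collections import defaultdict

def mapper(points, centroids):
    # read a data portion, assign each instance to a cluster, and emit
    # (cluster_id, (local_sum, local_count)) records
    sums, counts = defaultdict(lambda: None), defaultdict(int)
    for x in points:
        c = min(range(len(centroids)),
                key=lambda j: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[j])))
        sums[c] = list(x) if sums[c] is None else [a + b for a, b in zip(sums[c], x)]
        counts[c] += 1
    for c in sums:
        yield c, (sums[c], counts[c])

def reducer(cluster_id, values):
    # aggregate local centroids, weighted by local cluster sizes,
    # into one new global centroid
    total, n = None, 0
    for local_sum, local_count in values:
        total = local_sum if total is None else [a + b for a, b in zip(total, local_sum)]
        n += local_count
    yield cluster_id, [v / n for v in total]

Records emitted with the same cluster id all reach the same reducer, which is what lets the reducer form the global centroid.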
 

Panda et al, Chapter 2 

52


Discussion on MapReduce 

  • MapReduce is not designed for iterative processing
    • Mappers read the same data again and again
  • MapReduce looks too low-level to some people
    • Data analysts are traditionally SQL folks 
  • MapReduce looks too high-level to others
    • A lot of MapReduce logic is hard to adapt
      • Example: grouping documents by words
 

53


MapReduce wrappers 

  • Many of them are available
    • At different levels of stability 
  • Apache Pig is an SQL-like environment
    • Group, Join, Filter rows, Filter columns (Foreach)
    • Developed at Yahoo! Research
  • DryadLINQ is a C#-like environment 
    • Developed at Microsoft Research
 

Olston et al, SIGMOD 2008 

Yu et al, OSDI 2008 

54


k-means in Apache Pig: input data 

  • Assume we need to cluster documents
    • Stored in a 3-column table D:
  • Initial centroids are k randomly chosen docs 
     
     
     
    • Stored in table C in the same format as above
 

Document   Word   Count
doc1       new    …
doc1       york   …
…          …      …

55


k-means in Apache Pig: E-step

-- relations: D(d: document, w: word, id: count), C(c: centroid, w: word, ic: count)
D_C = JOIN C BY w, D BY w;

-- dot product between every document and every centroid
PROD = FOREACH D_C GENERATE d, c, id * ic AS idic;
PRODg = GROUP PROD BY (d, c);
DOT_PROD = FOREACH PRODg GENERATE d, c, SUM(idic) AS dXc;

-- length (L2 norm) of every centroid
SQR = FOREACH C GENERATE c, ic * ic AS ic2;
SQRg = GROUP SQR BY c;
LEN_C = FOREACH SQRg GENERATE c, SQRT(SUM(ic2)) AS lenc;

-- cosine-style similarity, then the closest centroid per document
DOT_LEN = JOIN LEN_C BY c, DOT_PROD BY c;
SIM = FOREACH DOT_LEN GENERATE d, c, dXc / lenc;
SIMg = GROUP SIM BY d;
CLUSTERS = FOREACH SIMg GENERATE TOP(1, 2, SIM);

56




k-means in Apache Pig: M-step 

-- join cluster assignments back to the documents' word counts
D_C_W = JOIN CLUSTERS BY d, D BY d;

-- per-(centroid, word) sums of counts, and per-centroid cluster sizes
D_C_Wg = GROUP D_C_W BY (c, w);
SUMS = FOREACH D_C_Wg GENERATE c, w, SUM(id) AS sum;
D_C_Wgg = GROUP D_C_W BY c;
SIZES = FOREACH D_C_Wgg GENERATE c, COUNT(D_C_W) AS size;

-- new centroids: average weight of each word in each cluster
SUMS_SIZES = JOIN SIZES BY c, SUMS BY c;
C = FOREACH SUMS_SIZES GENERATE c, w, sum / size AS ic;

62


MapReduce job setup time 

  • In an iterative process, setting up a MapReduce job at each iteration is costly
  • Solution: forward scheduling
    • Set up the next job before the previous one completes
 

Panda et al, Chapter 2 

[Diagram: for consecutive jobs, the next job's Setup overlaps the previous job's Process and Tear down phases, so data keeps flowing between jobs]

63


k-means in DryadLINQ 

// find the center closest to a given point
Vector NearestCenter(Vector point, IQueryable<Vector> centers)
{
    var nearest = centers.First();
    foreach (var center in centers)
        if ((point - center).Norm() < (point - nearest).Norm())
            nearest = center;
    return nearest;
}

// one k-means iteration: group vectors by their nearest center,
// then average each group into a new center
IQueryable<Vector> KMeansStep(IQueryable<Vector> vectors,
                              IQueryable<Vector> centers)
{
    return vectors.GroupBy(vector => NearestCenter(vector, centers))
                  .Select(g => g.Aggregate((x, y) => x + y) / g.Count());
}

Budiu et al, Chapter 3 

64


Takeaways on MapReduce wrappers 

  • Machine learning in SQL is fairly awkward 
  • DryadLINQ looks much more suitable
    • Beta available at http://blogs.technet.com/b/windowshpc/archive/2011/07/07/announcing-linq-to-hpc-beta-2.aspx
    • Check out Chapter 3 for a Kinect application!!!
  • Writing high-level code requires deep understanding of low-level processes
 

65


Parallelization: platform choices 

Next up: HPC Clusters (communication: MPI / MapReduce; data size: terabytes)

66


HPC clusters 

  • High Performance Computing clusters / blades / supercomputers
    • Thousands of cores
  • Great variety of architectural choices
    • Disk organization, cache, communication etc.
  • Fault tolerance mechanisms are not crucial
    • Hardware failures are rare
  • Most typical communication protocol: MPI
    • Message Passing Interface
 

Gropp et al, MIT Press 1994 

67


Message Passing Interface (MPI) 

  • Runtime communication library
    • Available for many programming languages
  • MPI_Bsend(void* buffer, int size, int destID)
    • Serialization is on you 
  • MPI_Recv(void* buffer, int size, int sourceID)
    • Blocks until the message is received
  • MPI_Bcast – broadcasts a message
  • MPI_Barrier – synchronizes all processes (an mpi4py sketch of these primitives follows below)
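The same primitives look like this through mpi4py's Python bindings; rank numbers and the payload are arbitrary, and pickling replaces the manual serialization, so the C-style size argument disappears.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    comm.send({"msg": "hello"}, dest=1)     # point-to-point send (cf. MPI_Bsend above)
elif rank == 1:
    data = comm.recv(source=0)              # cf. MPI_Recv: waits until the message arrives
    print(data)

token = comm.bcast("centroids" if rank == 0 else None, root=0)   # cf. MPI_Bcast
comm.Barrier()                                                   # cf. MPI_Barrier

Run it under an MPI launcher, e.g. mpirun -np 2 python script.py.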
 

68


MapReduce vs. MPI 

  • MPI is a generic framework
    • Processes send messages to other processes
    • Any computation graph can be built
  • Most suitable for the master/slave model
 

69


k-means using MPI 

  • Slaves read data portions
  • Master broadcasts centroids to slaves
  • Slaves assign data instances to clusters
  • Slaves compute new local centroids and local cluster sizes
    • Then send them to the master
  • Master aggregates local centroids weighted by local cluster sizes into new global centroids (see the sketch below)
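A compact mpi4py/NumPy sketch of that loop; the per-process data file, the iteration count, and the choice to let the master hold a data portion too are assumptions, not the Chapter 4 code.

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

X = np.load(f"part_{rank}.npy")   # hypothetical per-process data portion
k = 10

# the master picks initial centroids from its own portion
centroids = X[np.random.choice(len(X), k, replace=False)].astype(float) if rank == 0 else None

for _ in range(20):
    centroids = comm.bcast(centroids, root=0)               # master broadcasts centroids
    dists = ((X[:, None, :] - centroids[None]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)                           # assign local instances to clusters
    local_sums = np.array([X[labels == j].sum(axis=0) for j in range(k)])
    local_sizes = np.array([(labels == j).sum() for j in range(k)], dtype=float)
    sums = comm.reduce(local_sums, op=MPI.SUM, root=0)       # ship local stats to the master
    sizes = comm.reduce(local_sizes, op=MPI.SUM, root=0)
    if rank == 0:                                            # master forms new global centroids
        centroids = sums / np.maximum(sizes, 1.0)[:, None]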
 

Pednault et al, Chapter 4 

70


Two features of MPI parallelization 

  • State-preserving processes
    • Processes can live as long as the system runs
    • No need to read the same data again and again
    • All necessary parameters can be preserved locally
  • Hierarchical master/slave paradigm
    • A slave can be a master of other processes
    • Could be very useful in dynamic resource allocation
      • When a slave recognizes it has too much stuff to process
 

Pednault et al, Chapter 4 

71


Takeaways on MPI 

  • Old, well established, well debugged
  • Very flexible
  • Perfectly suitable for iterative processing
  • Fault intolerant
  • Not that widely available anymore 
    • An open source implementation: OpenMPI
    • MPI can be deployed on Hadoop
 

Ye et al, CIKM 2009 

72


Parallelization: platform choices 

Next up: Multicore (communication: multithreading; data size: gigabytes)

73


Multicore 

  • One machine, up to dozens of cores
  • Shared memory, one disk
  • Multithreading as a parallelization scheme
  • Data might not fit in RAM
    • Use streaming to process the data in portions (a chunked E-step sketch follows below)
    • Disk access may be the bottleneck
  • If it does fit, RAM access is the bottleneck
    • Use uniform, small memory requests
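As an illustration of processing the data in portions, here is a sketch that farms the k-means E-step out to a pool of workers and combines their local statistics; a process pool is used rather than threads purely to sidestep Python's GIL, and the worker count and chunking are arbitrary.

import numpy as np
from concurrent.futures import ProcessPoolExecutor

def local_stats(args):
    # one data portion: assign instances to the nearest centroid,
    # return local per-cluster sums and counts
    chunk, centroids = args
    dists = ((chunk[:, None, :] - centroids[None]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    k = len(centroids)
    sums = np.array([chunk[labels == j].sum(axis=0) for j in range(k)])
    counts = np.array([(labels == j).sum() for j in range(k)], dtype=float)
    return sums, counts

def parallel_estep(X, centroids, n_workers=8):
    chunks = np.array_split(X, n_workers)        # process the data in portions
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(local_stats, [(c, centroids) for c in chunks]))
    sums = sum(s for s, _ in results)
    counts = sum(c for _, c in results)
    return sums / np.maximum(counts, 1.0)[:, None]   # new centroids

Each worker touches only its own contiguous chunk of the data, in line with the uniform-access advice above.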
 

Tatikonda & Parthasarathy, Chapter 20 

74


Parallelization: platform choices 

Next up: GPU (communication: CUDA; data size: gigabytes)

75


Graphics Processing Unit (GPU) 

  • GPU has become General-Purpose (GP-GPU)
  • CUDA is a GP-GPU programming framework
    • Powered by NVIDIA
  • Each GPU consists of hundreds of multiprocessors
  • Each multiprocessor consists of a few ALUs
    • ALUs execute the same line of code synchronously
  • When code branches, some multiprocessors stall
    • Avoid branching as much as possible
 

76


Machine learning with GPUs 

  • To fully utilize a GPU, the data needs to fit in GPU memory
    • This limits the maximal size of the data
  • GPUs are optimized for speed
    • A good choice for real-time tasks
  • A typical use case: a model is trained offline and then applied in real time (inference)
    • Machine vision / speech recognition are example domains
 

Coates et al, Chapter 18

Chong et al, Chapter 21 

77


k-means clustering on a GPU 

  • Cluster membership assignment done on GPU:
    • Centroids are uploaded to every multiprocessor
    • A multiprocessor works on one data vector at a time
    • Each ALU works on one data dimension
  • Centroid recalculation is then done on CPU
  • Most appropriate for processing dense data
  • Scattered memory access should be avoided
  • A multiprocessor reads a data vector while its ALUs process a previous vector
 

Hsu et al, Chapter 5 

78


Performance results 

  • 4 million 8-dimensional vectors
  • 400 clusters
  • 50 k-means iterations
  • 9 seconds!!! 
 

79


Parallelization: platform choices 

Next up: FPGA (communication: HDL; data size: gigabytes)

80


Field-programmable gate array (FPGA) 

  • Highly specialized hardware units
  • Programmable in Hardware Description Language (HDL)
  • Applicable to training and inference
  • Check out Chapter 7 for a hybrid parallelization: multicore (coarse-grained) + FPGA (fine-grained) 
     
 

Durdanovic et al, Chapter 7

Farabet et al, Chapter 19 

81


How to choose a platform 

  • Obviously depending on the size of the data
    • A cluster is a better option if data doesn’t fit in RAM
  • Optimizing for speed or for throughput
    • GPUs and FPGAs can reach enormous speeds
  • Training a model / applying a model
    • Training is usually offline
 

82


Conclusion 

  • Big DATA
  • Crowdsourcing Labeled Data
  • Parallelization: Platform Choices
    • Example on k-means clustering
    • Peer-to-Peer
    • Virtual clusters
    • HPC clusters
    • Multicore
    • GPU
    • FPGA
 
 

83


Thank You! 

http://hunch.net/~large_scale_survey 

84

