The Kitchen Sink and Other Oddities

Atabey Kaygun

Faculty Networks and Inequality in Hiring Practices in Universities

I came across this paper, which is a detailed study of hiring networks and discrimination in faculty hiring across a large number of universities in the US. The authors also provide the data for other researchers to download and play with.

How I constructed the graphs

What I did was to graph all hiring data as a weighted directed graph. Every time university \(y\) hired someone from university \(x\), the weight of the directed edge \(x\to y\) is increased by 1. The final graphs are huge. So, I only pictured those edges whose weight, rescaled to lie between 0 and 10, is above a certain threshold value, and removed “All others” (which, I am guessing, indicates universities outside North America).
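
The counting step can be sketched in a few lines. This is only an illustration and not the code used for the pictures: the names `count-hires`, `heavy-edges` and `*toy-hires*` are hypothetical, the institution names are made up, and I use a hash table here where the actual code further down uses an assoc list.

```lisp
;; Hypothetical sketch: each hire is a list (source target), meaning
;; that TARGET hired somebody who graduated from SOURCE.
(defun count-hires (hires)
  "Return a hash table mapping each (source target) pair to its count."
  (let ((weights (make-hash-table :test #'equal)))
    (dolist (hire hires weights)
      (incf (gethash hire weights 0)))))

(defun heavy-edges (weights cutoff)
  "Return an assoc list of the edges whose count is strictly above CUTOFF."
  (let (res)
    (maphash (lambda (edge n)
               (when (> n cutoff)
                 (push (cons edge n) res)))
             weights)
    res))

;; Toy data with made-up institution names.
(defparameter *toy-hires*
  '(("X" "Y") ("X" "Y") ("X" "Z") ("Y" "X") ("X" "Y")))

;; Only the edge X -> Y occurs more than once, so only it survives.
(heavy-edges (count-hires *toy-hires*) 1)
```

Note that this sketch cuts on the raw count, while the code below cuts on a rescaled weight.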

Computer Science (with threshold value 1.5)

[Figure: hiring network for Computer Science]

The graph is very much like a tree. Almost a pyramid, one can say. You should probably play with several threshold values to see what I am about to observe:

  1. There is a tight clique among MIT, Stanford, Berkeley and Carnegie Mellon. They hire from each other a lot.
  2. There is a strong network among Canadian universities, but this is expected since, by Canadian law, Canadian universities must show a preference for Canadian citizens and permanent residents. There are two groups: universities in the West (British Columbia, Calgary and Saskatchewan) and in the East (Toronto, Waterloo, Ottawa, York, Queen's, McMaster). The University of Alberta, even though it is in the West, has strong connections with Toronto and Waterloo. Simon Fraser exhibits a similar behavior with Toronto. The University of British Columbia, on the other hand, is more open to hiring from the US and from Eastern Canadian universities.

History (with threshold value 2.0)

[Figure: hiring network for History]

The network is a lot more connected and complicated than the network for computer science, but the pyramid-like structure is still present.

Business (with threshold value 1.2)

[Figure: hiring network for Business]

The network is not as connected as those for history or computer science, but the pyramid-like structure is again present.

The lisp code

Here is the code. If you are going to run the code, here are the instructions:

  1. Put your data files in a directory called “data” and the following code in “src”. Rename the files you downloaded as “faculty-inequality-cs-vertices.csv”, “faculty-inequality-cs-edges.csv” for Computer Science. You should replace “cs” with “bus” and “hist” for the other files.
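  2. I prepared the code to be run on the command line with SBCL. The main section reads three values from “*posix-argv*”: the field (“cs”, “bus” or “hist”), the index of the vertex column used to label the nodes, and the threshold. The part with “*posix-argv*” works only with SBCL. On another implementation, make the appropriate changes in the main section, or hard code which files you are going to process.
  3. If you don’t want “All others” (foreign universities, I assume), grep those out as I did.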
  4. The code prints out the “.dot” file to be processed with Graphviz, which you should do separately. So, pipe the output into a “.dot” file and run one of the layout programs in the Graphviz package (“dot”, for example) on it.

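  ;; cl-ppcre is needed by `read-file`. Loading it through Quicklisp is
  ;; an assumption; use whatever mechanism your setup provides. It is
  ;; loaded silently so that nothing but the graph goes to the output.

  (ql:quickload "cl-ppcre" :silent t)

  ;; for each edge, replaces the two vertex ids by the chosen attribute
  ;; of the vertex, counts how many times each pair occurs, and returns
  ;; the counts as an assoc list to be fed into `dot-write`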

  (defun weighted-edges (edges vertices attribute)
    (let (res y)
      (dolist (x edges res)
        (setf y (mapcar (lambda (i) (elt (assoc i vertices) attribute)) x))
        (if (assoc y res :test #'equal)
            (incf (cdr (assoc y res :test #'equal)))
            (push (cons y 1) res)))))

  ;; process the vertices file

  (defun process-vertices (filename)
    (mapcar 
     (lambda (x) (mapcar (lambda (i j) (funcall j i))
                         x
                         (list 
                          #'read-from-string
                          #'read-from-string
                          (lambda (i) (if (string= i ".") 0 (read-from-string i)))
                          (lambda (i) (if (string= i ".") 0 (read-from-string i)))
                          (lambda (i) i)
                          (lambda (i) i))))
     (read-file filename)))

  ;; process the edge file

  (defun process-edges (filename)
    (mapcar 
     (lambda (x) (mapcar (lambda (i j) (funcall j i))
                         (list (car x) (cadr x)) 
                         (list #'read-from-string #'read-from-string)))
     (read-file filename)))

  ;; reads a tab-separated data file into a list of rows, skipping the
  ;; header line; each row is a list of strings

  (defun read-file (name)
    (let (res)
      (with-open-file (in name :direction :input)
        (read-line in)
        (do ((line (read-line in nil) (read-line in nil)))
            ((null line) res)
          (push (cl-ppcre:split #\Tab line) res)))))

  ;; The graph is weighted and represented as an assoc list where items
  ;; are of the form '((a b) . n)' where a is the source, b is the
  ;; target and n is the count. The counts are rescaled to lie between
  ;; 0 and 10. I also pass a threshold value. If the rescaled weight is
  ;; not above the threshold, the edge will not show up in the graph,
  ;; as the graph is very large.

  (defun dot-write (graph threshold)
    (let* ((num (mapcar #'cdr graph))
           (a (reduce #'min num))
           (b (reduce #'max num))
           ;; counts are integers, so this only guards against b = a
           (range (max 1 (- b a))))
      (format t "digraph transition {~% node[shape=\"rectangle\"];~%")
      (dolist (edge graph)
        ;; rescale the count to lie between 0 and 10
        (let ((weight (* 10 (/ (- (cdr edge) a) range))))
          (when (> weight threshold)
            (format t 
                    "  \"~a\" -> \"~a\" [penwidth = ~2,1f];~%" 
                    (string-trim '(#\Space) (caar edge))
                    (string-trim '(#\Space) (cadar edge))
                    weight))))
      (format t "}~%")))

  ;; the main part of the code
  
  (let* ((base (concatenate 
                'string 
                "data/faculty-inequality-" 
                (elt *posix-argv* 1)))
         (vertices (process-vertices (concatenate 'string base "-vertices.csv")))
         (edges (process-edges (concatenate 'string base "-edges.csv"))))  
    
    (dot-write (weighted-edges edges vertices (read-from-string (elt *posix-argv* 2)))
               (read-from-string (elt *posix-argv* 3))))
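
One detail in `dot-write` is easy to miss: the threshold is compared against the rescaled weight \(10(n-a)/(b-a)\), where \(a\) and \(b\) are the smallest and the largest counts, and not against the raw count \(n\). If you want to know which raw counts survive a given threshold, something like the following should do. The function `raw-cutoff` is a hypothetical helper and not part of the code above, and the counts 1 and 21 in the example are made up and not taken from the data.

```lisp
;; Hypothetical helper: an edge is drawn by `dot-write` exactly when
;; its raw count is strictly above the value returned here (assuming
;; MAX-COUNT is larger than MIN-COUNT).
(defun raw-cutoff (threshold min-count max-count)
  (+ min-count (* threshold (/ (- max-count min-count) 10))))

;; With made-up counts ranging from 1 to 21, a threshold of 1.5 keeps
;; only the edges that occur more than 4 times.
(raw-cutoff 1.5 1 21)
```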

How about mathematics?

The AMS has the necessary data but doesn’t disclose it for public use. I would very much like to see the network.

How is this useful?

If you are about to start a PhD, screw the US News rankings. Get the hiring network to determine your chances of being hired and, if there is a chance, by whom.