CS224N Lecture 03 Notes

Published 2020-05-29  415 views

Word Window Classification, Neural Networks, and Matrix Calculus


  • Named Entity Recognition (NER)
    • The task: find and classify names in text
    • BIO encoding
  • Window classification (for the NER task)
    • classify a word in its context window
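    • a simple way: average the word vectors in the window (loses position information)
    • a better way: concatenate the word vectors of the center word and its neighbors
    • Max-margin loss: J = max(0, 1 - S + S_other), where S is the score of the true window and S_other the score of a corrupt window; continuous but not differentiable everywhere → SGD can still be used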
  • Matrix calculus
    • transposes; dimensions
    • shape convention
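
The window classifier and max-margin loss above can be sketched roughly as follows. This is only an illustration in plain Python: the helper names `window_vector` and `max_margin_loss`, the zero-vector padding at sentence boundaries, and the toy embeddings are all assumptions, not something taken from the lecture.

```python
def window_vector(embeddings, center, radius):
    """Concatenate the vectors of the center word and its neighbors.

    Positions outside the sentence are filled with zero vectors
    (an assumption of this sketch; other padding schemes exist).
    """
    dim = len(embeddings[0])
    pad = [0.0] * dim
    window = []
    for i in range(center - radius, center + radius + 1):
        window.extend(embeddings[i] if 0 <= i < len(embeddings) else pad)
    return window


def max_margin_loss(s_true, s_other, margin=1.0):
    """J = max(0, margin - s_true + s_other)."""
    return max(0.0, margin - s_true + s_other)


# Toy 2-dimensional embeddings for a 3-word sentence (made-up numbers).
sentence = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Window of radius 1 around the first word: length (2 * 1 + 1) * 2 = 6.
x = window_vector(sentence, center=0, radius=1)

# The loss is zero once the true score beats the other score by the margin.
loss_satisfied = max_margin_loss(s_true=2.0, s_other=0.5)
loss_violated = max_margin_loss(s_true=0.5, s_other=1.0)
```

Concatenation keeps word order, so the classifier can tell the left neighbor from the right one, which averaging cannot.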


  1. Matrix calculus note
    • column vs. row vectors and when to transpose
    • "the shape of the gradient equals the shape of the parameter"
    • ReLU'(z) = sgn(ReLU(z)): 1 if z > 0, otherwise 0
  2. Lecture note
    • Keyphrases: Neural network; Forward computation; Backward propagation; Neuron units; Max-margin loss; Gradient checks; Learning rate; Xavier parameter initialization; AdaGrad
    • Maximum Margin Objective Function
      • only ensures that the "true" labeled example scores higher than the "false" one;
      • J = max(0, Δ + S_c - S), where S is the true score, S_c the corrupt score, and Δ the margin;
      • usually associated with SVM;
    • Tips and Tricks for Neural Networks
      • Gradient check
      • Regularization (bias terms are not regularized)
      • Dropout (remember to keep the expected input to the activation function the same at training and test time)
      • Neuron units
        • sigmoid: the default
        • tanh
        • hard tanh: computationally cheaper
        • soft sign
        • ReLU: usually the first one to try
        • leaky ReLU
      • Data preprocessing
        • Mean subtraction (zero-centering)
        • Normalization
        • Whitening
      • Parameter initialization
        empirical finding: sampling weights from a suitably scaled uniform distribution (Xavier initialization) maintains activation and backpropagated-gradient variance across layers
      • Learning strategies
        • Momentum updates
        • Adaptive optimization methods (AdaGrad): per-parameter learning rates that depend on the history of gradients
        • RMSProp; Adam
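
Two of the tips above, the gradient check and the AdaGrad update, can be sketched as below. This is a minimal plain-Python illustration under stated assumptions: the function names, the toy objective f(θ) = Σ θ_i², and the step sizes are hypothetical choices made for the example, not values from the lecture.

```python
import math


def numerical_gradient(f, theta, eps=1e-5):
    """Central difference per coordinate: (f(θ + ε) - f(θ - ε)) / (2ε)."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad


def adagrad_step(theta, grad, cache, lr=0.1, eps=1e-8):
    """Update theta in place; each coordinate's step shrinks as its
    accumulated squared gradient (the gradient history) grows."""
    for i, g in enumerate(grad):
        cache[i] += g * g
        theta[i] -= lr * g / (math.sqrt(cache[i]) + eps)


# Toy objective (an assumption for illustration) and its analytic gradient.
def f(theta):
    return sum(t * t for t in theta)


def analytic_grad(theta):
    return [2 * t for t in theta]


theta = [1.0, -2.0]

# Gradient check: the analytic and numerical gradients should nearly agree.
num = numerical_gradient(f, theta)
ana = analytic_grad(theta)
max_err = max(abs(n - a) for n, a in zip(num, ana))

# A few AdaGrad steps on the same objective.
initial_loss = f(theta)
cache = [0.0, 0.0]
for _ in range(200):
    adagrad_step(theta, analytic_grad(theta), cache)
final_loss = f(theta)
```

A large `max_err` usually points to a bug in the analytic gradient. The central difference is preferred over the one-sided difference because its error shrinks with ε² rather than ε.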

Suggested Readings

  1. Review of differential calculus theory
    the gradient is a vector while the differential is a function; the tensor case
  2. CS231n notes on backprop
    not available