Hello!

Hands On Large Language Models

Summary
A good handbook/manual that biases towards the how over the why, i.e., not a textbook. Easier to read if you already know some ML. A bit overboard on the illustrations. A great candidate for an online book that updates often, e.g., the RLHF section already feels a bit outdated.

Understanding Language Models

Introduction to LLMs
Describes historical context and progress of work towards the current (2024) state.
What is Artificial Intelligence - John McCarthy (2007)
word2vec (2013)
paper that introduced attention (2014)
Attention Is All You Need (2017)
BERT (2018)
GPT-1 (2018)
GPT-2 (2019)
GPT-3 (2020)
GPT-4 (2023)
Llama 2

Tokens and Embeddings
Tokenization levels: words, sub-words, characters and bytes.
Considerations: vocabulary size, special tokens (start, end, padding, mask), capitalization, whitespace sensitivity (e.g. for coding).
Token vs. sentence/doc embeddings: The embedding of the last token is not a good doc/sentence embedding.

Looking Inside LLMs
kv-caching for speeding up inference.
Speeding up attention: Sparse transformers (2019), Longformer - sliding window attention (2020), multi-query attention (2019), grouped-query attention (2023), Flash Attention (2022), Positional Embeddings (RoPE) (2021).
Packing multiple documents into a single context - https://arxiv.org/abs/2107.02027 and Packed BERT.

Using Pretrained Language Models

Text classification
Only covers using pretrained models. Start at Hugging Face’s Massive Text Embedding Benchmark (MTEB) leaderboard, which has lots of models benchmarked across several tasks plus metadata on model size.
Option 1: Find a model that is already trained for your classification task, e.g. RoBERTa for sentiment classification.
Option 2: Get embeddings from a pre-trained embeddings model, e.g. from sentence-transformers, and train a classifier using the embedding as a feature vector. Roughly equivalent to fine-tuning only a classification head on top of a frozen model.
Option 2.5: Zero-shot with embeddings of the labels, e.g. for a movie review `M` with embedding `e(M)`, compute cosine similarity of `e(M)` to `e(“this is a positive movie review”)` and `e(“this is a negative movie review”)`. Surprisingly, this works reasonably well (0.78 f1-score vs 0.85 for option 2 above) on a movie review sentiment classification task. Code here.
Option 3: Ask an instruction fine-tuned generative model, e.g. Flan-T5, which is T5 (encoder-decoder) with instruction fine-tuning. The prompt “Is the following sentence positive or negative?” gives an f1-score of 0.84. The same prompt with gpt-3.5 gives an f1-score of 0.91.

Text clustering and topic modeling
Generate embeddings for docs with a model from Hugging Face’s Massive Text Embedding Benchmark (MTEB) leaderboard, project to a lower dimension using PCA/UMAP and cluster using k-means/HDBSCAN.
BERTopic: Various algorithms to generate topic representations on top of clusters.

Prompt engineering
In-context learning: Give 1 or more examples in the prompt.
Chain prompting: Manually break up the task into multiple steps and chain outputs/inputs, e.g. to output a book, prompt with a topic to output a title, prompt with the generated title to output a summary, and prompt with the summary to output a book.
Chain-of-thought: (1) give reasoning example(s) in the prompt; (2) prompt with “let’s think step-by-step”.
Self-consistency: Sample multiple outputs and pick the majority/most popular.
Tree-of-thought: Multiple steps of reasoning and verification, or prompt the model to simulate a conversation between multiple experts.
Output validation: Constrain the output format using a prompt or by restricting the output tokens during sampling (see Guidance, Guardrails and LMQL).

Advanced Text Generation Techniques
Using quantized models in GGUF format.
Chaining LLM calls with LangChain.
Adding memory with full conversation buffers or conversation summaries.
Tool usage with ReAct in LangChain (already deprecated!).
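Going back to Option 2.5 in the text classification notes, here is a minimal sketch of zero-shot classification by comparing a document embedding against label-prompt embeddings. The names `toy_embed`, `VOCAB`, `LABEL_PROMPTS` and `zero_shot_classify` are hypothetical, and `toy_embed` is only a word-count stand-in so the example runs without downloading a model; it is not what the book uses. In practice `embed` would be a real sentence-embedding model such as the ones mentioned in Option 2.

```python
import math

# Label prompts taken from the notes above.
LABEL_PROMPTS = {
    "positive": "this is a positive movie review",
    "negative": "this is a negative movie review",
}

# Assumption: a tiny fixed vocabulary built from the label prompts,
# used only by the toy embedding below.
VOCAB = sorted({w for p in LABEL_PROMPTS.values() for w in p.split()})


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: word counts over VOCAB."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def cosine(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def zero_shot_classify(text, label_prompts, embed=toy_embed):
    """Return the label whose prompt embedding is closest to the text embedding."""
    doc_vec = embed(text)
    scores = {
        label: cosine(doc_vec, embed(prompt))
        for label, prompt in label_prompts.items()
    }
    return max(scores, key=scores.get), scores


label, scores = zero_shot_classify(
    "what a positive surprise this movie is", LABEL_PROMPTS
)
print(label, scores)
```

The classification logic only depends on `embed` returning a vector, so swapping in a real model should not require changing `zero_shot_classify`.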
Semantic Search and Retrieval-Augmented Generation (RAG)
Dense retrieval: Chunk the document, get embeddings for each chunk, embed the query, and find the closest chunks. Use FAISS/Annoy to scale up nearest-neighbor searches.
Re-ranking: Concatenate the query and doc and pass them as input to an encoder-style model trained to output 0/1 relevance scores.
RAG: Find relevant docs/chunks using embedding search, include them in the prompt and instruct an LLM to refer to them and/or cite.

Multimodal LLMs
ViT
CLIP - multimodal embeddings. Also, OpenCLIP.
BLIP-2 - multimodal (input) generator. Use for image captioning and queries about images.

Training and Fine-Tuning Language Models

Creating Text Embedding Models
Sentence-BERT/SBERT - 2-tower/Siamese network for contrastive learning of embeddings. Prior art was a cross-encoder but that is clearly much more expensive. “A solution to this overhead is to generate embeddings from a BERT model by averaging its output layer or using the [CLS] token. This, however, has shown to be worse than simply averaging word vectors like GloVe”. - Interesting claim. SBERT uses mean pooling. Why isn’t this a problem?
Training an embedding model - Fairly straightforward. Some choices of loss functions.
Fine-tuning: Same as training an embedding model above, but start with a pretrained embedding model.
Augmented SBERT: Generate training labels for an SBERT style model using a cross-encoder model.
Unsupervised training to learn embeddings - TSDAE uses a setup very similar to masked language modeling but the decoder only gets to see a (pooled) sentence embedding instead of token embeddings. Can be adapted to a domain by doing a supervised fine-tuning round on top of a pretrained model.

Fine-tuning Representation Models for Classification
Fairly straightforward - Fine-tune a pretrained BERT model for a classification task by unfreezing one or more layers.
SetFit: Surprising process: (1) Make a dataset of positive and negative sentence pairs from a labeled dataset; (2) Fine-tune a Sentence Transformer on it; (3) Learn a classifier on the fine-tuned embeddings. Why would this work any better than doing (3) directly?
Fine-tuning for Named Entity Recognition: Fairly straightforward but need to carefully align the word-level entity labels with tokens.

Fine-tuning Generation Models
Supervised Fine-tuning: Full fine-tuning and Parameter Efficient Fine-tuning using adapters or LoRA
Preference-tuning/Alignment/RLHF:
PPO: (1) Train a copy of the LLM to predict rewards based on a human preference dataset; (2) Fine-tune the original LLM using rewards from the reward model.
DPO
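The dense retrieval steps from the Semantic Search and RAG notes (chunk, embed each chunk, embed the query, find the closest chunks) can be sketched roughly as below. All names here (`chunk_words`, `toy_embed`, `build_index`, `retrieve`, `DOC`) are hypothetical, and `toy_embed` is a sparse word-count stand-in for a real embedding model so that the example runs offline. The brute-force scan is also an assumption for illustration; at scale it is what FAISS/Annoy would replace.

```python
import math
from collections import Counter


def chunk_words(text, size=5):
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: sparse word counts."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(document, size=5, embed=toy_embed):
    """Embed every chunk once, up front, and keep (chunk, vector) pairs."""
    return [(chunk, embed(chunk)) for chunk in chunk_words(document, size)]


def retrieve(query, index, k=1, embed=toy_embed):
    """Embed the query and return the k chunks with the highest similarity."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


DOC = (
    "KV caching speeds up inference. "
    "Flash Attention reduces memory traffic. "
    "RoPE encodes token positions."
)
index = build_index(DOC)
top = retrieve("what speeds up inference?", index)
print(top)
```

For RAG, the retrieved chunks would then be pasted into the prompt along with an instruction to refer to them; that step is omitted here.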
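Step (1) of the SetFit process above, building positive and negative sentence pairs from a labeled dataset, might look something like this. It is a sketch under assumptions: `make_pairs` and `EXAMPLES` are hypothetical names, and enumerating every pair is a simplification chosen for clarity, not necessarily how the SetFit library samples its pairs.

```python
from itertools import combinations

# Assumption: a tiny labeled dataset of (text, label) examples.
EXAMPLES = [
    ("loved every minute", "positive"),
    ("a wonderful film", "positive"),
    ("dull and far too long", "negative"),
    ("I want my money back", "negative"),
]


def make_pairs(examples):
    """Turn labeled examples into (text_a, text_b, same_label) training pairs.

    same_label is 1 for a positive pair (both texts share a label)
    and 0 for a negative pair (the labels differ).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        pairs.append((text_a, text_b, 1 if label_a == label_b else 0))
    return pairs


pairs = make_pairs(EXAMPLES)
for text_a, text_b, same in pairs:
    print(same, "|", text_a, "|", text_b)
```

One thing the sketch makes visible: 4 labeled examples already yield 6 training pairs, so the number of pairs grows much faster than the number of labels.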

March 21, 2025

30 Lessons For Living - Karl Pillemer

A book with practical advice on life, relationships, careers, and happiness, based on surveys and interviews with a large number of older Americans. Great Together - Lessons for a happy marriage Marry someone a lot like you. Similarity in core values and background is the key to a happy marriage. And forget about changing someone after marriage. Friendship is as important as romantic love. Don’t keep score. Don’t take the attitude that marriage must always be a fifty-fifty proposition; you can’t get out exactly what you put in. The key to success is having both partners try to give more than they get out of the relationship. “The only way you can make a marriage work is to have both parties give 100 percent of the time. … The attitude has to be one of giving freely. And if you start keeping score, you are already in trouble.” When you wake up in the morning, think, “What can I do to make her day or his just a little happier?” Mornings in my house tend to be rushed, and with two busy people there’s a temptation to look out for oneself, especially when stress gets the better of you. … Can we leave when it fits my schedule? Can I stay late today while you go to the grocery store? … If leaving fifteen minutes later or picking up the milk on the way home accomplishes that, why not do it? It definitely puts a different cast on the day. Talk to each other. We all need to learn how to fight. Fights are inevitable; it’s how we handle them that matters. " … just because you have a fight, it’s not the end of things … Ten minutes later you forget about it. As you get older, it becomes five minutes" Tip 1: If you are having trouble discussing something, get out of the house. Tip 2: Find a way to blow off steam, and then engage with your partner. Tip 3: Watch out for teasing. Tip 4: Let your partner have his or her say. Don’t just commit to your partner - commit to marriage itself. 
Rather than view marriage as a voluntary partnership that lasts as long as the passion does, the experts see it as a profound cultural arrangement that we should respect, even if things go sour over the short term. “Any relationship will go through dark times as well as bright times, so that the high points are richly enjoyed, but there are going to be valleys that you are going to have to trek through and not give up. … Look it will be a struggle, but it has to be a struggle or else it’s not a fully lived life”. Postscript: Don’t go to bed angry. The end of the day means that very soon someone is going to have the last word and someone will be deeply hurt, and there will be nowhere further along the road to travel. Most things that couples disagree upon aren’t worth more than a day’s combat. Glad to get up in the morning - Lessons for a successful and fulfilling career Choose a career for the intrinsic rewards, not the financial ones. A sense of purpose and passion for one’s work beats a bigger paycheck any day. Don’t give up on looking for a job that makes you happy. Make the most of a bad job. salvaging a less-than-ideal job by becoming really good at it. Emotional intelligence trumps every other kind. no matter how talented you are, no matter how brilliant - you must have interpersonal skills to succeed. … traits like empathy, consideration, listening skills, and the ability to resolve conflicts are fundamentals in the workplace. “I had the attitude that I might have certain skills but mostly everybody here knows more than I do. … if I’m going to add value, it’s going to be by making use of these people or by collecting information from them or marshaling what it is they’re doing. This means that whatever they give me to do I’m going to try to do it to the best of my ability, working with whomever I have to work with.” Everyone needs autonomy. Postscript: The first thing in the morning. 
When it comes to evaluating your career, the experts collectively arrived at this … diagnostic test: do I wake up in the morning looking forward to work? “No amount of money is worth more than having a job that you’re glad to get up and go to every morning, instead of one you dread.” “If you can’t wake up in the morning and want to go to work, you’re in the wrong job.” Spending years in a job you dislike is a recipe for regret and a tragic mistake. Nobody’s perfect - lessons for a lifetime of parenting. It’s all about time. Spend more time with your children … sacrifice to do it. Your kids don’t want your money (or what your money buys) anywhere near as much as they want you. Specifically, they want you with them. If you and your spouse work seventy-hour weeks to buy consumer goods and take lavish vacations, you are misusing your time. Our kids are often closed up tightly like clamshells, hard on the outside but with a soft and vulnerable interior. Suddenly and unexpectedly, however, they will decide to open up, and if you’re not there … you might as well be on the moon. … time shared in mundane daily activities and interactions rather than memorable “special occasions.”. … involve your children routinely in activities, and that requires your physical presence for large blocks of time. “It’s more important to devote your time to whatever they’re interested in. Otherwise you’re going to lose them. They’ll become strangers”. “I can remember riding home with the kids in the car and being so involved in my mind, going over what had happened during the day and what I should be doing the next day, that I didn’t hear those little voices and what they were sharing with one another and with me.” It’s normal to have favorites but never show it. Don’t hit your kids. Avoid a rift at all costs. See the potential rift early and defuse it. Act immediately after the rift occurs. When all else fails, it’s the parent who usually needs to compromise. 
Take a lifelong view of relationships with children. The years of raising young children and adolescents … are … a blur, a rush of activity so hectic … it seems to have passed in an instant … But … most of the time we spend as parents is not when kids are dependents in the family home but when they are adults … Parents need to keep in mind what comes after. What are you doing when your child is age five, ten, or fifteen that will create a lasting, loving relationship over the much longer time of his or her adulthood and your middle and old age? As your life goes on, you will want your children there … When you are in your seventies and beyond, your children provide you with continuity, meaning, attachment, and ultimately an overarching sense of a greater purpose in life. .. You’ve made the investment. From midlife on, you will deeply desire … the “payoff”. As you make decisions regarding child rearing, think about the payoff in the long run. Consider actions towards your children in the long term. When you are in your later years, you are likely to have one simple desire … that they like you and wish to be around you. … actions that get in the way of that future should be vigorously avoided. Postscript: Abandon perfection … most parents hold themselves up to some kind of perfect standard when they evaluate their parenting. In addition we hold up children to impossible standards, comparing them to ideals of well-behaved, hardworking youngsters that exist in our imaginations alone … (but) no one has perfect children … The reassuring thing is that most kids turn out pretty well nevertheless. Being a good-enough parent means allowing kids to fail. “My husband and I have the same attitude about our kids: we put them in situations where they could make decisions, and they didn’t always make the right ones but they learned from their mistakes and that’s important.” relax your expectations and assume that failure is inevitable at all times. 
Dealing with problems in supportive ways is what counts, not an ideal of perfection. None of the five lessons requires perfection, just openness, the ability to listen, and good intentions. All these are qualities all parents can develop. Find the Magic - Lessons for Aging Fearlessly and Well Being old is much better than you think. Don’t waste your time worrying about getting old. It can be a time of opportunity, adventure, and growth. See it as a quest, not an end. Act now like you will need your body for a hundred years. It’s not dying (from not staying healthy) that you should be worried about - it’s chronic disease. What you can expect from not making the right health decisions isn’t an early death - in fact, that’s the least of your worries - instead you should be concerned about years, possibly decades, of suffering from chronic disease. Don’t worry about dying - the experts don’t. Don’t spend a lot of time fretting about mortality … recommend is careful planning and organization for the end of life. Stay connected. The Alameda County Study showed that the absence of social ties predicted dying among older persons, even when taking into consideration things like social class and health status. Tip 1: Take advantage of learning opportunities. Being interested in the world around you and choosing to learn more about something that you are curious about stimulates the mind. Tip 2: Make a conscious goal of staying connected. Set specific goals that will lead to greater connectedness. Learn to be social. Enjoy the people around you - don’t criticize them so severely. Become aware of how our network can shrink in mid-life and take steps to stay connected. Plan ahead about where you will live (and your parents too). Don’t let fears deter you/older relatives from considering a move to a senior living community. Such a move opens up opportunities for better living, rather than limiting them. Postscript - don’t fight it. 
People who age successfully select the activities they most value and optimize the returns they get from them. on running: You realize that if you can’t be running this fast, well, you just go slower, but you keep on running. Do what you’re able to do and accept that there might be some limitations. I can look everyone in the eye - Lessons for living a life without regrets Always be honest. Say yes to opportunities. Travel more. Choose a mate with extreme care. Say it now. Leaving critical things unsaid or unasked, … can’t be changed after the person is gone. … a conversation is a great regret-prevention strategy. Postscript: Go easy on yourself regarding mistakes and bad choices you have made. Choose happiness - lessons for living like an expert Time is of the essence. Life is short. The problem for younger people is in the “mechanics” of acting on this awareness. If it is true that we will be keenly aware of the shortness of life when we reach the end, what should we do? Take advantage of every day you are given. “carpe diem” - the meaning of the original Latin - “harvesting” the day. Each day has an unharvested abundance of pleasure, enjoyment, love, and beauty that many younger people miss. A very common human failing is not taking advantage of life’s pleasures and attending to the very joy of being alive. “There are no wheelchair ramps to the bottom of the Grand Canyon, so if you want to get down there, you have to go when you’ve still got two little feet.” Skip the funerals and see your friends now. Strive for happiness with what we are given, right now, and make this perspective a daily habit. This attitude is the gift we receive from awareness that life is short. Happiness is a choice, not a condition … happiness requires a conscious shift in outlook in which one chooses - daily - optimism over pessimism, hope over disillusionment, and openness to pleasure and new experiences over boredom and listlessness. 
The dominant perspective among the young says: “I will be happy if only I …”. if I lose weight, find a mate, get healthy, get rich, and on and on. … Such a “happy if only” attitude is futile and will inevitably lead to disappointment. (but) researchers tell us that changes in our circumstances - getting that great job, the move you’ve dreamed about, even getting married or winning the lottery - only give us a temporary bump in our happiness level. So for all the if-onlys we set our sights on, there is at best a short-term boost in our happiness level. We can make a conscious decision each day to embrace a positive attitude. It requires convincing yourself that you can wake up and decide to focus on positive emotions. “You are completely in control of your attitude and your reactions … if you feel annoyance, fear, or disappointment, these feelings are caused by you and must be dug out like a weed. Study where they came from, accept them, and then let them go.” Time spent worrying is time wasted. “Don’t worry. There’s never an excuse to worry, and it makes it impossible for you to act appropriately.” Tip 1: Focus on the short term rather than the long term. “This too will pass”. “It’s a good idea to plan ahead if possible, but you can’t always do that because things don’t always happen the way you were hoping they would happen. So the most important thing is one day at a time.” Tip 2: Instead of worrying, prepare. (there is) a distinct difference between worry and conscious, rational planning that greatly reduces worry. It’s the free-floating worry, after one has done everything one can about a problem, which seems so wasteful. Tip 3: Acceptance is an antidote to worry. “Life is good. … If I’m going to die tomorrow, then I’ll die tomorrow. How else can you live? Life is short, you have to be open-minded. 
… Learn to accept instead of worrying - then you will be okay.” A critically important strategy for regret reduction is increasing the time spent on concrete problem solving and drastically eliminating time spent worrying. One activity enhances life, whereas the other is deeply regretted as a waste of precious time. Think small. Become attuned to the minute pleasures that younger people often are only aware of if they have been deprived of them: a morning cup of coffee, a warm bed on a winter night, a brightly colored bird feeding on the lawn, an unexpected letter from a friend, even a favorite song on the radio. “Go about the business of the day, but walk on your tip toes, waiting for the “aha!” experiences. That way you’re always open to and watching for something different.” “There is a lot to be gained from just being able to be in the moment and able to appreciate what’s going on around you right now, this very second … It brings peace. It helps you find your place. It’s calming in a world that is not very peaceful. But I wish I could have learned this in my 30s instead of my 60s - it would have given me decades more to enjoy life in this world. That would be my lesson for younger people.” Experience a sense of gratefulness for the fact of being alive and for the innumerable simple pleasures that are available in any given day or hour. Most of us will almost certainly develop this ability late in life; a question to ask ourselves is why not create a savoring approach to life in one’s 20s or 30s rather than in one’s 80s or 90s? Have faith. There are two main reasons why practicing faith is an important lesson for living: it provides a source of community, and it offers unique help with coping in times of trouble. Postscript: The Golden Rule. Do to others what you want them to do to you. “Who have you helped? What circles do you move in? Who likes you? Some people I’ve known, they never helped anybody. They never did anything. 
They were never in any circles - they live their own life totally unto themselves. You know what? Nobody would go to their funerals. It would be as though they never passed by on earth. They didn’t make any ripples. They didn’t interact or help or do anything to build up anyone else.” The Last Lesson Listen to the experts (elders) in your life. One reason why we don’t ask older people for advice and wisdom is because we don’t see them much. We have moved away from a time when people lived together in multigenerational households. Today many older people live alone and children tend to be geographically dispersed. … Studies show that almost all of our friends are within 10 years of our own age, and many are within five years … The first step in breaking down the age barriers is to talk with one another. I strongly recommend that you spend some time asking older people in your social network the kinds of questions that were asked in this book or you can take this book to elders you care about and ask them if they agree with the lessons here. This is how knowledge for living was once transferred: the experience of interlocking lives, intertwined over generations, was passed along and remained alive in the telling. This wisdom exists in people you know, right here, right now. And it’s yours for the asking.

October 23, 2024

Slow Productivity - Cal Newport

A non-standard productivity book that in reality is based on the author’s personal experiences but has a bunch of “studies” and anecdotes to make it look more authoritative. Do Fewer Things Limit the Big. Limit missions. Limit projects. Limit daily goals. Contain the Small. Put tasks on autopilot. Synchronize - reduce interruptions and process similar items in a batch by using office hours and docket-clearing meetings. Make other people work more. Avoid “task engines” - projects that generate lots of small tasks/lots of communication overhead. Spend money: delegate/hire people to do small things for you. Pull instead of push work. How to simulate a pull based process: Set up a holding tank and active items list. Intake procedure: new requests go into the holding tank by default along with a reply communicating the ETA. List cleaning: update ETAs, update priorities, update stakeholders on new ETAs. Work at a natural pace. Take longer. Make a five-year plan. Double your time estimates. Simplify your work day (cut your daily task list in half, block time off for deep work). Forgive yourself when slow productivity causes missed opportunities. Step back and recalibrate. Embrace seasonality. Schedule slow seasons. Define a shorter work year. Implement “small seasonality”. No Meeting Mondays (or Tuesdays or Fridays). See a matinee once a month. Schedule Rest projects. Work in cycles (Basecamp does 8 weeks on; 2 weeks cooldown). Work poetically (bad title). Match your space to your work (no specific advice here). Strange is better than stylish. Separate remote work from working from home. “When we pass the laundry basket outside our home office, our brain shifts towards a household-chores context”. “When seeking out where you work, be wary of the overly familiar”. Rituals should be striking. Form your own rituals around the work you find most important. 
Ensure your rituals are sufficiently striking to effectively shift your mental state into something more supportive of your goals. Obsess over quality. Improve your taste (judgment). Become a cinephile (or any other domain). Start your own inklings - form a group of like-minded professionals, all looking to improve what they’re doing. Buy a $50 notebook - invest in your tools. Bet on yourself Write after the kids go to bed - devote precious, uncluttered time to your main pursuit. Be okay with reducing your salary. Announce a schedule for getting something done - social commitment. Attract an investor - someone willing to pay for the quality of your output.

September 17, 2024