How to Start Lifting Weights: A Guide for Gen X

I’ve recently had a few friends ask me for advice on how to get back into the weight room after years or decades off, or how to get started as an over-40 person. I put together the first version of this guide for one of those friends, an extremely successful tech executive in his early 50s with some athletic background who has been out of the weight room for many years. For this post, I’ve updated and expanded on the original version to make it more accessible to a general audience.

Who this guide is for: This is a guide for people in their late 30s – late 50s who are either beginners in the weight room or who have taken a long time away from lifting and want to get back into lifting. It will work equally well for men and women. The goal is to build a balanced foundation of movement competence and strength to support daily life and self-sufficiency. If you follow this program consistently, you’ll become one of the fittest people you know, and you will be able to go hiking with the teenagers, chase the grandkids, help your parents and siblings, do your own yard work (if you like that!), and take great care of yourself and your loved ones. These are all great goals. The program is designed with longevity, safety, reliability, and consistency in mind. It is designed to be accessible to all levels of fitness and to novices and beginners.

What this guide isn’t. This isn’t a fully comprehensive guide on the topic; that would require a book-length treatment. I don’t spend a lot of time on exercise descriptions as there are decent videos available from reputable websites that cover how to safely and effectively perform the movements at levels from beginner to elite athlete. I also don’t provide a comprehensive list of exercises; there are literally hundreds of exercises and variations available. For a good comprehensive resource for exercises, Arnold’s Encyclopedia of Bodybuilding is an excellent place to start. Or jump on YouTube and start searching, but caveat emptor, it can be hard to separate the wheat from the chaff. (I will list several of the sources that I trust for lifting advice at the end of the post.) This also isn’t an intermediate, advanced, or elite program. It isn’t designed to build or test a 1-rep max on any exercise, and I recommend against doing that until one is more advanced.

This is a long post. There is a lot that can be learned in pursuing fitness, and I only scratch the surface in this post. It is organized in a way that I hope will make it easily accessible. Section 1 provides an overview of the basic movement patterns and lists of exercises for each pattern by experience level. Section 2 provides some general rules for how to approach learning and executing the exercises for both machines and free weights. Section 3 provides three example programs – one for two days/week of lifting, one for three days/week of lifting, and one for four days/week of lifting. Section 4 covers how to get started, how to progress into the routines, how to make progress week to week, and when to get help. Section 5 briefly covers how to integrate this program with cardio and interval (HIIT) training, nutrition and recovery, what to do when you get sore, and what to do if you have old injuries or develop new ones.

Section 1: Basic Movement Patterns and Exercise Selection

In the programs below, I don’t list specific exercises, but rather movement patterns. Everyone’s physical proportions are different, so some exercises work better/feel better/are less injurious for each of us. The key is to keep our strength balanced by apportioning the work across the relevant movement patterns. For our purposes, we’ll group exercises into seven movement patterns. Within each pattern, I list basic, intermediate, and advanced exercises. If you have some previous experience with lifting, go where you are comfortable. If you find that it’s harder than you remember, you can step back a level until you find your footing again. If you are a beginner at lifting, start with the Basic Level exercises and progress to intermediate and advanced versions as you gain strength and confidence.

1. Hip Hinge Movement Pattern: In the hip hinge, the back is neutral and static, the legs are neutral and static, and all movement is around the hip joint. It is the foundation of a proper deadlift and primarily engages the muscles of the hip (glute max and hamstrings), along with the spinal erectors in a static contraction. *The booty note: It has recently become quite popular, especially with women, to focus on developing the gluteus muscles. Although glute-focused movements are not strictly hip hinges, I’ll include them here anyhow. If you want to round out your tush, add a glute movement to the routines below, but don’t drop the Hip Hinge movement from your routine.

Basic Level Hip Hinge Exercises:

  • 45-degree hip (back) extension: Video demo here. I’d rather you focus on your glutes than hamstrings on these, at least until your hamstrings build some strength.
  • Dumbbell Romanian Deadlift: Video demo here. Focus on pushing the hips back with a slight bend in the knees. Lower until hamstring flexibility limits movement. Focus on the glutes to raise back up. 
  • *Glute: various Glute machines, usually of the kick-back variety.

Intermediate Level Hip Hinge Exercises: 

  • Weighted 45-degree hip (back) extension. As above, but with a weight plate held at the chest or a plate/barbell held at arm’s length vertically below shoulders. Start adding weight when 20 unweighted is easy.
  • Deadlifts. Deadlifts are worthy of a long post all by themselves. At their simplest, they consist of placing a barbell on the floor directly in front of you, reaching down and grabbing it, and then standing up to a position of attention. If you have significant experience deadlifting with a coach, you should be safe on these. If they are new to you, find a good coach to help. 
  • *Glute: Hip thrust machines, or hip thrusts on the Smith machine.

Advanced Level Hip Hinge Exercises:

  • Romanian Deadlifts with a barbell. RDLs should be thought of as starting at the top of the movement, descending to the limit of hamstring mobility, then driving back up from the hips.
  • Stiff-Legged Deadlifts with a barbell. SLDLs are not the same as RDLs. SLDLs start on the floor, like a normal deadlift, but we use a stiff-legged (legs fixed at 10-20 degree bend in the knees) movement pattern (a hip hinge) to lift the weight.
  • Heavy Deadlifts.
  • Barbell good mornings. Same movement pattern as an RDL, but with the barbell across the back as in a back squat.
  • *Glute: barbell or dumbbell hip thrusts, wide stance squats.

2. Squat Movement Pattern: As in the hip hinge, the back is neutral and static, but now we squat down/back roughly as if sitting in a chair until the knees are past 90 degrees, then stand back up. There are many, many variants of the squat; here are a few of the more useful ones.

Basic Level Squat:

  • Machine leg press. Pin selector machines are generally ok. 45-degree leg press sleds are better. Pendulum squat machines are even better. Avoid any machines where you lie on your back and press straight up – too much unnecessary risk. Aim to ensure your range of motion gets the knees bent to greater than 90 degrees. It’s easy to ego lift on these with a shortened range of motion. This is the path to injury. On the other hand, for most people, bringing the knees all the way to the chest is too much range of motion, and shifts load to the lower back in a suboptimal position. 

Intermediate Level Squat:

  • Smith machine squat and variants. Back squats, front squats, sissy squats, feet forward back squats, etc. The most common errors here are not aligning your body to the machine, and not facing the correct direction for safety. For safety, if you lose control of the bar, make sure that it rotates in the direction that causes the safety catches to engage. Lots of people orient to the mirror instead of the safety. It’s loud, scary, and sometimes entertaining when they get stapled under the weight.
  • Goblet squats.
  • Bulgarian Split Squats with dumbbells. Most people only need to hold one dumbbell, and it should be in the hand on the same side as the working leg. These require some balance and practice. I recommend grabbing a piece of 1-inch PVC pipe or a broom handle (most gyms have these stashed in a corner somewhere) and using it for balance in the non-engaged hand.

Advanced Level Squat: 

  • Barbell squats, high bar or low bar
  • Barbell front squat
  • SS Bar/Yoke bar squats. These are especially useful if you have shoulder mobility limitations or elbow/wrist injuries that make it difficult or painful to back squat.

3. Vertical Pull Pattern: This pattern is simply pull-ups and variations. These can be done with a variety of grips, etc. to target the upper back and latissimus dorsi muscles (lats). Wider grips hit the rhomboids and teres major/minor more effectively, while narrower grips hit the lats more effectively. Pronated grips (palms away) put more stress on the back and less on the biceps; supinated grips (palms toward the face) shift some of the stress from the back to the biceps. Neutral grips (palms facing each other) are also available and can be useful if the other grips cause elbow issues.

*A note on pull-ups. These sometimes get a reputation as a big scary monster. The average American man can’t do one, and they tend to be a bit harder for women, since men tend to have more upper-body muscle mass. I believe that anyone can build the strength to do pull-ups. Doing so may require progressing and building the components of the exercise, identifying and attacking weaknesses, possibly losing some weight, and some time and grit. But it’s like any other skill. Absent a limiting injury or illness, I believe anyone can get there. There are lots of good programs out there for getting your first pull-up. There are also some that are rubbish. Here are a few general rules of thumb. Chin-ups are easier for most beginners than pull-ups. A shoulder-width grip (or narrower if comfortable) is easier for most beginners than a wide grip. If you can do a reasonably strict cable pull-down with about 80-85% of your body weight, you are probably strong enough to do a pull-up. Finally, pull-ups respond better than most exercises to training the eccentric movement, that is, climbing or jumping to the top of the movement and then lowering yourself as slowly as possible.

Basic Level Vertical Pull:

  • Cable Lat pull-downs. 
  • Machine pull-downs.

Intermediate Level Vertical Pull:

  • Chin-ups
  • Pull-ups

Advanced Level Vertical Pull:

  • Weighted chin-ups
  • Weighted pull-ups

4. Horizontal Pull Pattern: Rows and variations. Can be done with a variety of grips, angles, and implements of destruction.

Basic Level Horizontal Pull:

  • Cable rows. In order of difficulty: narrow grip (easiest) – medium grip – wide grip (1.5x shoulder width). A pronated (overhand) grip is preferred for back development. A supinated (underhand) grip favors more bicep involvement at the expense of back development.
  • Machine rows. Hammer Strength has a series of pull-down and row machines that are plate loaded and work pretty well. You’ll have to figure out how to align your body correctly to the machine. Caution: you’ll see a lot of ego lifting on these by guys doing one arm at a time and using a lot of body English. The fix is to do both arms simultaneously.

Intermediate Level Horizontal Pull:

  • Single-arm dumbbell rows. Minimize trunk twist involvement on these.
  • Chest-supported dumbbell or barbell rows.

Advanced Level Horizontal Pull: 

  • Barbell rows.
  • Pendlay rows.
  • Kroc rows.

5. Vertical Presses. In a vertical press, the line of force is from shoulder height to overhead, vertical relative to your spine. These are shoulder presses and variants.

Basic Level Vertical Presses:

  • Machine shoulder presses. These can be the pin-selector type machine, or the plate-loaded lever-type machines (such as Hammer Strength)

Intermediate Level Vertical Presses:

  • Dumbbell shoulder presses
  • Arnold presses
  • Barbell shoulder presses on the Smith Machine
  • Seated Barbell shoulder presses

Advanced Level Vertical Presses:

  • Standing barbell shoulder presses

6. Incline Presses. In an incline press, the line of force is between 30 and 60 degrees from vertical/horizontal. For most people, 30 degrees is recommended. Steeper inclines tend to present more risk of shoulder impingement. Some builds will favor a shallower incline of 10-20 degrees. Most gyms will have both fixed incline benches and adjustable incline benches.

Basic Level Incline Presses:

  • Machine incline presses. These can be the pin-selector type machine, or the plate-loaded lever-type machines (such as Hammer Strength)

Intermediate Level Incline Presses:

  • Dumbbell incline presses
  • Barbell incline presses on the Smith Machine

Advanced Level Incline Presses:

  • Barbell incline press

7. Flat (Bench) Presses. In a flat press, the line of force is perpendicular to your spine.

Basic Level Flat Presses:

  • Machine Flat (Bench) Presses. These can be the pin-selector type machine, or the plate-loaded lever-type machines (such as Hammer Strength). 

Intermediate Level Flat Presses:

  • Dumbbell Bench Presses
  • Bench Presses on the Smith Machine

Advanced Level Flat Presses:

  • Barbell Bench Presses

Section 2: General Rules for Learning Your Way Around the Gym

If you are a beginner, or even if you used to lift but are just getting back into it, the array of equipment at the average gym can be overwhelming. Even on machines, where the movement pattern is defined to a large degree by the design of the machine itself, the diagrams aren’t usually very helpful and it can be hard to know how to set the machine up the right way for your unique size and proportions. This section gives you some rules of thumb for barbells, dumbbells, and machines that will help you find your way.

Single joint and compound movements.
For our purposes, we divide movements into either single-joint or compound. In a single joint movement, only one joint bends. In a compound movement, more than one joint bends. All of our movement patterns are compound movements, except for the hip hinge. The other single-joint movements in the programs below are isolation movements for the arms and legs, such as arm curls, triceps extensions, and leg curls.

Machine set-up for single-joint movements. When it comes to machine set-up, I see the most errors on the single-joint movement machines. The principle here is simple. The axis of rotation of the machine should be aligned as perfectly as possible with the axis of rotation of the joint involved. For a leg curl machine, the pivot point of the lever arm on the machine should be aligned with the pivot points of both of your knees. Once you find this alignment, adjust the rest of the machine to support the rest of your body in a way that keeps the joint perfectly aligned. Being out of alignment will make the movement feel awkward (you’ll likely feel a sticking point in the movement where it suddenly becomes more difficult), and over time it will create repetitive shear stresses on the joint that will increase the probability of injury. You may find a machine where you can’t get properly aligned because your proportions and the design of the machine don’t allow it. In this case, select a different movement for that muscle. In the case of cable/pulley machines for curls and triceps, it is generally best to have the elbows slightly forward of your body, the upper arms vertical, and to imagine fixing your elbows in space as you perform the movement.

Alignment for pressing movements: 
For vertical presses, you want a straight, vertical line between your elbow, wrist, and the load, and this line is perpendicular to the floor. This ensures that you are putting force in the opposite direction of gravity. It is ok, and generally desired, for the elbows to be a bit in front of the line between the tips of your shoulders. This is because your shoulder joint rides on the scapula, and your scapulae are not parallel to the plane that runs through the tips of your shoulders. So we want the arms aligned with the scapulae (this is called the scapular plane), which is generally about 15-20 degrees forward of the shoulders. In some cases, it is even preferred to have the elbows forward of the scapular plane (this is true in Arnold presses, and usually in barbell shoulder presses). For the Smith machine and barbell overhead pressing, you will have to get the bar around your head, which also necessitates the elbows being forward a bit. As you get the barbell over your head, you will want to align it more closely with the shoulder joints, or else it will fall forward.

For incline and flat presses, you want your elbows, wrists, and the load stacked vertically. For incline presses, this means that the bar (or DBs, or your hands) will “touch” between the nipple line and the clavicle. How far toward one or the other depends on the angle of the incline; if the bench is flatter (10-30 degrees), you will touch lower on the chest; if the incline is steeper, you will touch higher on the chest. For a flat bench press, the bar will touch at or a bit below the nipple line, depending on your proportions.

On machines, for incline and flat pressing variations, if the seat is too low, you’ll feel like your hands are getting jammed into your shoulders. If the seat is too high, horizontal presses start to resemble dips and incline presses start to resemble horizontal presses. For machine vertical (shoulder) presses, if the seat is too high, your hands will get jammed into your shoulders. If the seat is too low, you won’t be able to lower the weight to at or just above shoulder height.

Stability. There is a trend in gyms, popularized by glossy magazines, glitzy websites,  and clueless trainers, that we should do exercises in unstable positions. The theory is that this helps work “those little stabilizer muscles,” which has apparently become a physiological term of art. Do not do this when holding a weight. We want to be as stable as possible so that we can safely and effectively apply force to the load in order to make our bodies stronger. This is especially true when holding a weight over your head, face, or throat. In general, if you are lifting a weight and your feet are on the ground, you want to imagine grabbing the ground with your feet and evenly distributing pressure across the whole foot, then building tightness in the legs and hips, then building tightness in the core and upper back, then gripping the bar as if you are trying to choke the life out of it. After you have done this, you are ready to move the weight. This is a skill that takes some time and practice to learn and takes years to perfect. The earlier we start, the better. If you want to work on balance as a skill, the best way to do that is to go to a yoga class or an adult gymnastics class.

Section 3: Three Example Programs

These programs are based on the time you have available for lifting. In Section 5, I’ll talk about recommendations for cardio and HIIT to support overall fitness and health and how to integrate those alongside the lifting.

2 Days/week:
If you can only fit 2 days a week of lifting into the schedule, I recommend a full-body program where you hit all of the major muscle groups each workout. The advantage is that you hit every muscle twice a week, which is generally good, but the workouts are a bit long in terms of total sets and time, so you may find yourself running out of steam toward the end. I find that adding a few shakes of salt to my water bottle helps with this. Do this routine twice a week on non-consecutive days.

Legs:
Hamstring curls: warm-ups, then 3 sets of 15-20
A Squat movement: 3 sets of 12-20
A Hip Hinge movement: 3 sets of 12-15

Back:
A Vertical Pulling movement: 3 sets of 8-15 (*pull-ups exception to reps)
A Horizontal Pulling (rowing) movement: 3 sets of 12-15

Chest/Shoulders:
A Vertical Pressing movement: 3 sets of 12-15
Option 1: An Incline (10-30 deg from flat) Pressing movement: 3 sets of 8-12
Option 2: A Flat Pressing movement: 2-3 sets of 8-20
*Especially in the beginning, you can do either incline or flat. Some people’s shoulders will tolerate one better than the other. As you get in better shape, you’ll probably want to do both.

Optional (but recommended) Arms:
Curl variation: 3-5 sets of 12-20, supersetted with
Triceps variation: 3-5 sets of 10-20

3 Days/week:
This is a traditional push/pull/legs split where you’ll hit everything once per week with a little more volume each time. For getting back into lifting, you may feel like you want to hit it more often, but you’ll make 85+% of the progress once a week that you will twice a week, and the workouts are a bit shorter in terms of time, which will allow you to generate a bit more intensity. It’s a nice balance. For this program, we add a couple of accessory movements for joint health and mobility as well. And we get to do arms twice a week. You can do M/W/F, T/Th/Sa, however it works for you. You can also simply program this on a 6-day “week” alternating a lifting day with a non-lifting day.

Day 1: Upper Body Push/Arms
A Vertical Pressing movement: 3 sets of 12-15
An Incline (10-30 deg from flat) Pressing movement: 3 sets of 8-12
A Flat Pressing movement: 2-3 sets of 8-20
A Shoulder Lateral Raise movement (dumbbells or cables): 2-3 sets of 20
Curl variation: 3-5 sets of 12-20, supersetted with
Triceps variation: 3-5 sets of 10-20

Day 2: Legs
Hamstring curls: warm-ups, then 3 sets of 15-20
A Squat movement: 3 sets of 12-20
A Hip Hinge movement: 3 sets of 12-15
Calf Press movement: 2 sets of 20-25
Tibialis Raises: 3 sets of 20-25

Day 3: Upper Body Pull/Arms
A Vertical Pulling movement: 3 sets of 8-15 (*pull-ups exception to reps)
A Horizontal Pulling (rowing) movement: 3 sets of 12-15
Face pulls: accumulate 100 reps with a light weight.
Curl variation: 3-5 sets of 12-20, supersetted with
Triceps variation: 3-5 sets of 10-20

4 Days/week:
This is the same basic routine as the 3 days/week, but we add a second pull day. The reason for this is that a large number of shoulder issues are a downstream result of doing too much pressing relative to pulling, which creates imbalances in strength around the shoulder. From an evolutionary perspective, primate shoulder anatomy evolved primarily for swinging from trees, not for lying on our backs pressing heavy weights. The cure is to do more pulling than pressing. With the extra day, we get 2x the pulling, and some additional work for shoulder health (of course you can add it to the 3 Day program, if desired). Program on either a 7-day or 8-day week.

Day 1: Upper Body Pull/Arms
A Vertical Pulling movement: 3 sets of 8-15 (*pull-ups exception to reps)
A Horizontal Pulling (rowing) movement: 3 sets of 12-15
Curl variation: 3-5 sets of 12-20, supersetted with
Triceps variation: 3-5 sets of 10-20
Face pulls: accumulate 100 reps with a light weight. *see note below
Dead Hangs: 2-3 rounds of max time (aim to progress to 60 sec each)

Day 2: Legs
Hamstring curls: warm-ups, then 3 sets of 15-20
A Squat movement: 3 sets of 12-20
A Hip Hinge movement: 3 sets of 12-15
Calf Press movement: 2 sets of 20-25
Tibialis Raises: 3 sets of 20-25

Day 3: Upper Body Pull/Arms
A Vertical Pulling movement: 3 sets of 8-15 (*pull-ups exception to reps)
A Horizontal Pulling (rowing) movement: 3 sets of 12-15
Curl variation: 3-5 sets of 12-20, supersetted with
Triceps variation: 3-5 sets of 10-20
Face pulls: accumulate 100 reps with a light weight.
IYWTVs: 2 rounds of 30-sec each *see note below
Dead Hangs: 2-3 rounds of max time (aim to progress to 60 sec each) *see note below

Day 4: Upper Body Push
A Vertical Pressing movement: 3 sets of 12-15
An Incline (10-30 deg from flat) Pressing movement: 3 sets of 8-12
A Flat Pressing movement: 2-3 sets of 8-20
A Shoulder Lateral Raise movement: 2-3 sets of 20
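
As a rough check on the pull-to-press balance described for the 4-day plan, here is a small tally of weekly working sets for the main compound patterns. It is only a sketch under a few assumptions of mine: ranges like "2-3 sets" are counted at the top end, and lateral raises, arms, face pulls, and dead hangs are left out because they are not prescribed as compound press or pull sets. The names `PULL_DAY`, `PUSH_DAY`, and `weekly_sets` are made up for this example.

```python
# Working sets per session for the main compound patterns in the 4-day plan.
# Assumption: "2-3 sets" is counted as 3 (top of the range).
PULL_DAY = {"vertical pull": 3, "horizontal pull": 3}
PUSH_DAY = {"vertical press": 3, "incline press": 3, "flat press": 3}


def weekly_sets(day: dict[str, int], sessions_per_week: int) -> int:
    """Total working sets per week for one type of training day."""
    return sum(day.values()) * sessions_per_week


pull_sets = weekly_sets(PULL_DAY, sessions_per_week=2)  # two pull days
push_sets = weekly_sets(PUSH_DAY, sessions_per_week=1)  # one push day

print(f"Pull sets per week: {pull_sets}")
print(f"Push sets per week: {push_sets}")
print(f"Pull-to-push ratio: {pull_sets / push_sets:.2f}")
```

Under those assumptions the plan comes out with more weekly pulling sets than pressing sets, before the face pulls and dead hangs are even counted.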

Additional Exercises:
I included hamstrings specifically at the beginning of leg day for three reasons. First, a large number of men our age with “lower back issues” really have hamstring issues that cause the lower back to compensate. Having strong, mobile hamstrings helps this a lot. Second, similarly, a large number of knee issues are caused by hamstring weakness relative to quad strength. Third, getting the blood flowing to the hamstrings will improve the quality of movement and effectiveness of your hip hinge and squat movements and reduce the chance of injury on those. It’s a win-win-win. I prefer the seated leg curl machine as my primary hamstring movement and I prefer higher reps (15-20) and volume. I’ve torn a hamstring in the past and I don’t like loading these heavy for that reason. I’ve done as much as 10 sets of 10 reps, 3x/week for months on these with great success.

Face Pulls: These are for shoulder health. The exercise is gaining popularity in gyms these days, and guys are loading up a ton of weight and making it into a weird bicep-heavy standing-row thing. This defeats the purpose and adds a ton of injury risk. Here’s a very good video of proper technique. You’ll notice that the rather big guy demonstrating these is using a very light weight. This is the way. I use 20-25 pounds and accumulate 100 reps over as few sets as possible. The first set is typically 40-50 reps. These are for rear delts and rotator cuff strength – i.e. for shoulder health, and also help with upper back strength and posture. Also, notice his grip – very important. For some reason I almost never see anyone using the proper grip on these. Any other way of holding the rope attachment should feel very awkward.

IYWTVs: These are rotator cuff rehab exercises I learned during my shoulder surgery recovery. When I neglect them, I feel it. Here’s a decent explanation video. I do these lying on a bench face down rather than on the floor. I prefer to do the IYWs with palms down and the Ts with thumbs down and frankly I’m not sure why she teaches thumbs up. It creates some extra external rotation stress that isn’t necessary or helpful. My physios taught palm down for all variants. I also do “Vs” which is arms at 45 deg between legs and perpendicular, palms up. I prefer to do 30-second holds with a very light weight. Start with no weight. I’ve never used more than 3 pounds per hand. That’s plenty hard. Cycle through all five, resting as necessary, 2-3 rounds. Important tip: focus on keeping your upper traps (shrug muscles) relaxed throughout. That will force the right muscles to activate.

Tibialis raises: I got these from Ben Patrick (@kneesovertoesguy on Instagram). I highly recommend working his level zero knees over toes into your routine. Tib raises and backward walking are the entry points.

Dead hangs: I got these from physical therapists Bob and Brad on YouTube. The idea is to accumulate 3-5 minutes of time hanging from a pull-up bar. They go over how it improves shoulder health and how to progress into it here.

Section 4: How to Start and How to Make Progress

Starting and Progression: I recommend doing 1-2 working sets of each exercise for the first couple of weeks. Try different exercises and figure out what feels best to you. Learn what your strength levels are. Add one working set every couple of weeks as your work capacity increases until you get to the sets listed.
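
To make the ramp-up concrete, here is a minimal sketch of one way to schedule it. It assumes you start at 1 working set and add a set every 2 weeks until you reach the listed number; the guide only says "every couple of weeks," so the exact timing is my assumption, and the function name `working_sets_for_week` is hypothetical.

```python
def working_sets_for_week(week: int, target_sets: int = 3,
                          start_sets: int = 1, weeks_per_step: int = 2) -> int:
    """Working sets per exercise in a given training week (week 1 is the first).

    Assumes one set is added every `weeks_per_step` weeks, capped at the
    number of sets listed in the program (`target_sets`).
    """
    if week < 1:
        raise ValueError("week must be 1 or greater")
    steps_completed = (week - 1) // weeks_per_step
    return min(target_sets, start_sets + steps_completed)


for week in range(1, 7):
    sets = working_sets_for_week(week)
    print(f"Week {week}: {sets} working set(s) per exercise")
```

If your work capacity is lagging, stretch `weeks_per_step` out rather than forcing the schedule.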

Rep Schemes: I’ve made this a relatively high-rep program. I find that generally works better for me for hypertrophy and fitness as I get older, and usually results in less wear and tear on the joints. Peaking for absolute strength is a totally different game, and I strongly recommend against doing anything less than 5 reps, even “just to find out” for the first year or two. The exception to this is pull-ups. If you already have pull-ups it’s fine to use them in your back day at fewer reps per set than listed. Progress toward accumulating the listed reps over more sets. For example, if you already have five good pull-ups, accumulate 15 reps in 5 sets of 3. Then add one or two reps each workout. So the next workout might be sets of 4-3-3-3-3, then 4-4-3-3-3, etc.
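
The pull-up example above can be written as a simple rule: each workout, add a rep to the earliest set that has the fewest reps. This is just a sketch of that rule; the function name `next_pullup_workout` is mine, and nothing says you have to distribute the added reps exactly this way.

```python
def next_pullup_workout(sets: list[int], reps_to_add: int = 1) -> list[int]:
    """Add reps one at a time to the earliest set with the fewest reps."""
    updated = list(sets)
    for _ in range(reps_to_add):
        lowest = updated.index(min(updated))  # first set at the lowest count
        updated[lowest] += 1
    return updated


workout = [3, 3, 3, 3, 3]  # 15 total reps in 5 sets of 3
for _ in range(3):
    print("-".join(map(str, workout)), "=", sum(workout), "total reps")
    workout = next_pullup_workout(workout)
```

Starting from 3-3-3-3-3, this produces 4-3-3-3-3 and then 4-4-3-3-3, matching the progression described above.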

Progressive Overload Always Applies:  It will take you a few workouts to find what weight gets you to failure on each exercise within the listed window of reps. That’s fine. Once you know, for example, that you can do lat pulldowns with 100 pounds and fail at 15 reps, then add 5 or 10 pounds the next workout. You may get 12 or 13 reps. Over the next couple of weeks, work back up to 15. Add weight again. This is a basic linear progression. It’s enough for the first couple of years of training. If you get bored in the 12-15 range, add more weight and spend a couple of months in the 8-12 rep range. Programming here isn’t specific enough that it really matters much. But progressive overload (doing a little more each workout) matters almost 100%. You will eventually get to a point where you plateau. Keep doing the sets and reps. Learn how to increase intensity. The way to break the plateau as a beginner/intermediate is to continue to accumulate sets and reps, not to change a bunch of stuff. Patience and consistency win.
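
Here is a small sketch of the linear progression just described: stay at a weight until you hit the top of the rep window, then add weight. The function name `next_weight`, the default 5-pound jump, and the sample log are illustrative assumptions of mine, not a prescription.

```python
def next_weight(current_weight: float, reps_achieved: int,
                top_of_range: int = 15, increment: float = 5.0) -> float:
    """Weight to use next workout under a simple linear progression.

    If the top of the rep range was reached, add `increment`; otherwise
    repeat the same weight and try for more reps.
    """
    if reps_achieved >= top_of_range:
        return current_weight + increment
    return current_weight


# Hypothetical lat pulldown log: (weight in pounds, reps on the top set)
log = [(100, 15), (105, 12), (105, 14), (105, 15)]
for weight, reps in log:
    upcoming = next_weight(weight, reps)
    print(f"{weight} lb x {reps} reps -> next workout: {upcoming:.0f} lb")
```

If you move to the 8-12 range for a couple of months, the same rule applies with `top_of_range=12`.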

Boring is Effective: Resist the urge to change things. You need consistency far more than you need exercise variety. Picking one movement for each pattern is enough for at least a year. Trying to “mix it up” or “confuse the muscles” will hamper progress in most cases. The exception is that as you (re)gain comfort in the gym, you may move from basic movements to intermediate, to advanced ones. Let the excitement result from tracking your personal best on each exercise. If you previously managed 10 reps with 100 pounds, and this week you do 11 reps, that is a new personal record! Celebrate the progress.

Keep the main thing the main thing. I have a blog post about how to frame training generally. TL;DR: in order of priority: Consistency, recovery, technique, intensity, programming.

Get appropriate help. For the advanced exercises, if you don’t have experience with them already (and maybe if you do), get someone to coach you. I have a blog post about how to find a competent trainer. Shoot me a note or a video if you need advice on a particular issue. Or if you want, you can always fly me out for a week, and I’ll teach you everything you need to know!  And it will be fun.

Section 5: Cardio, HIIT, Nutrition, Recovery, and Injuries

Cardio and High-Intensity Interval Training (HIIT). I find that about 2 hours of Zone 2 (HR ~120) incline walking per week, and 1-2 bouts of high-intensity intervals per week on a bike or rower make everything else better. For the intervals, I rotate around doing longer intervals (1:00 work/2:00 recovery, 5-8 rounds), medium intervals (:30 work/1:30 recovery, 5-8 rounds), and short intervals (:20/:10 work/recover, 8 rounds; :10/:20 work/recover, 8 rounds). You can do this on non-lifting days or on lifting days. If doing cardio on lifting days, it doesn’t much matter whether you do your Zone 2 work before or after lifting (I prefer after). If you do your HIIT on a lifting day, do it at the end of your workout.
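
For planning purposes, it can help to see how much time each interval format takes. The sketch below totals the work and recovery time for the formats listed above. The `IntervalProtocol` class and its field names are hypothetical, and the totals count the recovery after the last round but no warm-up or cool-down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalProtocol:
    """One work/recovery pairing, in seconds."""
    name: str
    work_s: int
    rest_s: int

    def session_minutes(self, rounds: int) -> float:
        """Total minutes for `rounds` rounds, counting the final recovery."""
        return rounds * (self.work_s + self.rest_s) / 60


# Work/recovery times and (min, max) rounds as listed in the paragraph above.
PROTOCOLS = [
    (IntervalProtocol("long", 60, 120), (5, 8)),
    (IntervalProtocol("medium", 30, 90), (5, 8)),
    (IntervalProtocol("short 20/10", 20, 10), (8, 8)),
    (IntervalProtocol("short 10/20", 10, 20), (8, 8)),
]

for protocol, (min_rounds, max_rounds) in PROTOCOLS:
    low = protocol.session_minutes(min_rounds)
    high = protocol.session_minutes(max_rounds)
    print(f"{protocol.name}: {low:g} to {high:g} min of interval work")
```

Even the longest format here is under half an hour of interval work, which is why it fits at the end of a lifting session.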

Nutrition. Talking about nutrition runs the risk of creating a religious war, but I’ll do it anyhow because it is important. In general, no matter what your goals are, more protein in the diet is better. If you reliably eat 1 gram of protein per pound of body weight per day, you will be amazed at how quickly you will get results. If your body fat percentage is above 20% for men or 30% for women, you can eat 1 gram of protein per pound of lean body mass (total body mass – fat mass). A popular objection/fear is that eating too much protein can interfere with kidney function. This has been debunked for people with healthy kidneys. The other frequent objection is that it’s hard to eat that much protein from real food. This can be true, especially if you aren’t used to it. But I find that it’s generally a good thing when I am both struggling to eat enough food and losing body fat. It beats the barely-losing-body-fat-while-struggling-not-to-gnaw-my-arm-off feeling of low-protein, low-calorie diets. Of course, there are many competing hypotheses in the nutrition world. I find I feel best eating my protein from real food sources. Save protein shakes for rare emergencies.
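
The protein rule of thumb above is easy to turn into arithmetic. This sketch assumes you know (or can estimate) your body fat percentage, and it switches to lean body mass automatically whenever you are over the threshold, even though the guide presents that as an option rather than a requirement. The function name, the `sex` labels, and the example numbers are all mine.

```python
from typing import Optional

# Body-fat thresholds from the paragraph above; the "male"/"female" keys are
# just labels chosen for this sketch.
LEAN_MASS_THRESHOLD_PCT = {"male": 20.0, "female": 30.0}


def daily_protein_grams(body_weight_lb: float, sex: str,
                        body_fat_pct: Optional[float] = None) -> float:
    """Daily protein target in grams at 1 g per pound.

    Uses lean body mass (total mass minus fat mass) when body fat is above
    the threshold for `sex`; otherwise uses total body weight.
    """
    if body_fat_pct is not None and body_fat_pct > LEAN_MASS_THRESHOLD_PCT[sex]:
        fat_mass_lb = body_weight_lb * body_fat_pct / 100
        return body_weight_lb - fat_mass_lb  # 1 g per pound of lean mass
    return body_weight_lb  # 1 g per pound of total body weight


# Hypothetical examples
print(f"{daily_protein_grams(180, 'male'):.0f} g/day")        # body fat unknown
print(f"{daily_protein_grams(200, 'male', 30):.0f} g/day")    # above 20%
print(f"{daily_protein_grams(150, 'female', 25):.0f} g/day")  # below 30%
```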

Recovery. I talked about recovery in some detail in my hierarchy of fitness priorities post.  In order of importance for recovery: sleep, nutrition, mobility work, hot/cold treatments, bodywork (massage, etc.). For many people in their mid-30s and older, especially men, sleep issues may be present and undiagnosed. I highly recommend getting a sleep study (they can be done at home now) if you know, or your partner tells you, that you snore and/or stop breathing at night.

On Injuries. If you have old injuries, talk to your doctor or, probably better, an experienced physical therapist and seek their advice on how you might modify your exercise selection or form to accommodate your particular set of battle scars. Luckily, being active again will likely reduce some of the stiffness from old dings and you will likely find that you regain some lost range of motion as well.

Most of us will experience some sort of pain as a result of being active. As they say, accumulating injuries is the price of not spending life sitting on the couch eating Cheetos. DOMS, or delayed-onset muscle soreness, is to be expected from any effective fitness routine. Its degree seems to vary greatly among individuals. I typically have some DOMS from every lifting workout, even with many, many years of experience. The best fix for DOMS seems to be movement and getting the blood flowing into the sore muscles. For this reason, it is not necessary to delay your next workout until the soreness has faded.

Occasionally, you may get hurt. I think of this in three categories: hurt, injured, and broken (with apologies to Dave Tate). Roughly, “hurt” is a minor tweak or strain that goes away in a few days to a week, or some mild tendonitis. I generally work through being “hurt” as motion and blood flow help the body to heal through the normal range of motion and reduce scar tissue accumulation. “Injured” is a muscle pull or tear or a joint issue that is bad enough that it can’t be worked through and requires some rest. Usually we can work around this injury and maintain most of our fitness while seeking some physical therapy for the injured area. “Broken” is worse – this is a surgery-level injury that requires major modifications to our training plan and may curtail it altogether for a period of time. Hopefully, this won’t happen, but sometimes it does. The good news is that in a beginner program with moderate to high reps and moderate weight such as this one, it is unlikely to have something worse than a minor “hurt” occur.

Bonus Section: Fitness Myths for the Gen X Athlete

Myth 1: You can’t build muscle and strength if you are xx years or older.
Busted: This is a popular misconception, along with the belief that after about 40, the best you can hope for is to slow your decline. I disagree, and so does the research. It’s never too late to start. Muscle and strength are extremely tightly correlated with (and probably also causally related to) both longevity and health span (roughly, how long we can remain active and independent). My personal example is that in early 2023, at the age of 52, I reached a strength goal in the bench press that I had been pursuing for most of 10 years, and that was a lifetime personal record by almost 10%. In late 2022, I made a lifetime-best deadlift as well. And somewhere in between there, I made a new personal record for pull-ups, performing 8 more reps than I did when I was a Division I competitive swimmer, and weighing about 65 pounds more than I did in college. The best day to get started was yesterday, the second-best day is today.

Myth 2: Doing xx exercise is inherently dangerous.
Busted: This one is usually applied to exercises like squats and deadlifts, and also frequently to other compound barbell lifts. While all exercise carries some element of risk because it is inherent in the nature of reality that risk is never zero, most exercises can be performed in a way that the benefit greatly outweighs the risk. There may be specific injuries that limit you from performing specific exercises, but there will definitely be at least one exercise from each movement pattern that you can do safely and effectively.

Myth 3: This one is for the ladies. It goes like this: I’m worried that if I lift heavy weights, I’ll get “bulky.”
Busted: This is the first cousin of “I just want to get toned.” Strength training causes two main adaptations in muscle tissue: hypertrophy (growth of the muscle itself) and increases in strength (the ability of the muscle tissue to produce contractile force). The two are related. In most cases, hypertrophy also causes an increase in the ability to produce force. However, one can train in a way to increase strength with little to no hypertrophy. It is really hard to build a large amount of muscle mass (“get bulky”). For natural/unenhanced men (men who are not using anabolics), one can realistically only expect to gain about one pound of muscle per month after the first year of training (an untrained man may be able to gain about 20 pounds of lean mass in the first year). For unenhanced women, it’s probably about half of that. There are genetic factors that apply here. Some can gain a bit more; some will struggle to gain a bit less. The vast majority of women who appear “bulky” due to muscle mass are enhanced. With hard training and on-point nutrition, a natural woman may gain about 5 pounds of muscle mass in a year, and due to having much less testosterone than men or enhanced women, her upper limit of muscularity won’t result in a bulky look. If she also loses 5 pounds of fat mass, the change in her appearance toward athleticism and health will be obvious to everyone around her. And her confidence and mood will improve as well, both from the results and from pride in the discipline it took to achieve.

Myth 4: The exercise science says I should do x or y or some other thing.
Busted: I’m a huge fan of the scientific method. And I think there are some significant problems with how we do research these days. These are topics for another post. My two basic observations on peer-reviewed science and exercise are (1) you can find a research paper to support any crazy thing you want to do, so that’s nice and also not particularly useful, and (2) in sports science, practice tends to lead research by decades. That is, Usain Bolt’s coach knows how to train athletes to run fast, and researchers are frequently coming in behind the events to figure out the physiological mechanisms that cause the methods to work. Find what works for you and trust it. Modify if/when necessary. To fully evaluate a program takes about a year, so don’t change lots of variables quickly. As a beginner, follow a beginner program for at least a year before you change anything. After a year or two, you might know enough to try changing one thing. Be patient. It’s the long game that matters.

Conclusion.

Good luck getting back into the gym. This program will put you into the top 10% of fittest people our age very quickly, and if you want to progress to more advanced training it will have provided a solid foundation. As with anything in life, there is always more to learn, and this rather long blog post is far from comprehensive on the topic. Let me hear how it goes and send me any questions you might have.

In strength,
Scott

Resources:

For excellent video demonstrations of most of the exercises listed in this program, follow Alexander Juan Antonio Cortes (AJAC) on Instagram here. I’ve met AJAC, and I’ve followed his blog for years. Although we have some minor differences of opinion, he knows his stuff and I trust his advice. I also recommend his programs.

For excellent video demonstrations of some of the exercises and very good instructional breakdowns of the underlying movement principles, follow Juggernaut Training (Chad Wesley Smith and Marisa Inda). Chad and Marisa are both international elite powerlifters. They have a mix of foundational and advanced videos. Their series on effective core bracing is especially good.

EliteFTS.com (YouTube channel here) has hundreds of videos from a large variety of experts on all topics of lifting. Their material tends to be more tailored for advanced lifters, but they do have some good introductory material as well. It’s worth keeping in mind that most of their experts are enhanced. Recommended for those who are contemplating getting into more advanced lifting or competing in strength sports.

AI Safety: A Partial Critique of Russell’s “Provably Beneficial AI”

With all of the recent advancements in AI, especially the November release of ChatGPT and last week’s upgrade of the underlying GPT engine to version 4, there is a lot of ongoing discussion about AI Safety.

AI Safety is a term that has come to mean a lot of things, and as such, it requires some further elaboration so we know what we are actually talking about. According to Wikipedia (Mar 21, 2023): “AI safety is an interdisciplinary field concerned with preventing accidents, misuse, or other harmful consequences that could result from artificial intelligence (AI) systems. It encompasses machine ethics and AI alignment, which aim to make AI systems moral and beneficial, and AI safety encompasses technical problems including monitoring systems for risks and making them highly reliable.”

For this post, I’ll be having a conversation with ChatGPT 4 to provide an introduction to one concept that lives within AI Alignment. The concept is called “Provably Beneficial AI” and has been proposed, refined, and advocated primarily by Stuart Russell of the University of California at Berkeley, where he is the Director of the Center for Human-Compatible AI, a professor of Computer Science, and the Smith-Zadeh Chair in Engineering. He literally wrote the book on AI.

I did the interview twice, once with the GPT 3.5 engine and once with the GPT 4 engine. GPT 4 generally provided more detail in its responses, with a strong tendency to use numbered lists. GPT 3.5 was more conversational, but provided less detail. I used the same prompts for both, and since I did GPT 3.5 first, the GPT 4 prompts might feel a little clunky, but I wanted to be able to compare the two and get a feel for the differences. In the interview below, I am S: and ChatGPT-4 is GPT.

Enjoy!


S: Briefly explain Russell’s concept of “provably beneficial AI”

GPT: Russell’s concept of “provably beneficial AI” refers to the development of artificial intelligence systems that can be mathematically proven to act in ways that are beneficial to humans. This concept aims to ensure that AI systems consistently work towards the best interests of humanity, avoiding unintended negative consequences.

In order to achieve this, the AI must be designed with certain principles in mind, such as uncertainty about its objectives and deference to human preferences. By constructing AI systems that acknowledge their own uncertainty and learn from human input, we can work towards creating AI that remains aligned with human values and provides benefits without causing harm.


S: What is the strongest critique of Russell’s “provably beneficial AI?”

GPT: One of the strongest critiques of Russell’s “provably beneficial AI” concept is the challenge of defining and quantifying human values and preferences. Human values are complex, diverse, and often contradictory, making it difficult to create a universally accepted model that can guide AI systems in a provably beneficial way. Some specific concerns include:

  1. Value alignment: Aligning AI systems with human values is challenging because values can vary significantly between individuals, cultures, and over time. It is unclear how an AI system can be designed to account for these variations and adapt to changing preferences without risking unintended consequences.
  2. Incompleteness and ambiguity: Human preferences are often incomplete or ambiguous, which makes it hard for AI systems to make decisions that are guaranteed to be beneficial. A provably beneficial AI needs a way to handle such situations while remaining aligned with human values.
  3. Preference manipulation: A learning AI system designed to be deferential to human preferences may inadvertently encourage humans to express preferences that are not in their long-term best interests, leading to potential manipulation of the AI system for short-term gains.
  4. Moral dilemmas and ethical trade-offs: Real-world situations often involve moral dilemmas and ethical trade-offs that can’t be easily resolved. Designing a provably beneficial AI that can navigate these complex scenarios in a way that respects diverse perspectives and ethical considerations is a significant challenge.
  5. Technical difficulties: The development of provably beneficial AI requires significant advancements in AI safety research, including robust methods for ensuring AI systems learn and follow human values, even as they become more capable. This is a difficult technical problem that has not yet been fully solved.

S: Didn’t Russell specify a big data approach to defining human values and objectives by essentially looking at what decisions humans made historically and inferring the underlying values and objectives?

GPT: Yes, Stuart Russell did suggest an approach to defining human values and objectives by examining the decisions humans have made historically, with the idea that AI systems can learn these values and preferences through a big data analysis. This approach is part of the broader concept of “inverse reinforcement learning” (IRL), where AI systems learn to infer the underlying objectives or values that drive human decision-making by observing human behavior.

However, despite the promise of this big data approach, it is not without challenges and limitations. Some concerns include:

  1. Biases and inaccuracies: Historical data can be biased or incomplete, reflecting cultural, social, or political norms of a particular time period. These biases may not accurately represent the values and objectives of all humans, potentially leading to AI systems that reinforce existing inequalities or injustices.
  2. Changing values: Human values and preferences evolve over time, and what might have been considered acceptable in the past might not be considered acceptable today. AI systems need to be adaptable and able to recognize these shifts to remain beneficial.
  3. Overgeneralization: Inferring human values and objectives from historical data can lead to overgeneralization, where AI systems may not be able to distinguish between genuinely shared values and preferences that vary between individuals, cultures, or contexts.
  4. Incompleteness and ambiguity: As mentioned earlier, human preferences can be incomplete or ambiguous, making it difficult to infer clear, universally applicable objectives from historical data.
  5. Ethical dilemmas and trade-offs: Learning human values from historical data does not necessarily provide AI systems with a clear framework for resolving complex ethical dilemmas or making trade-offs between competing values.

Overall, while Russell’s big data approach to defining human values and objectives offers a potential path forward, it is not without significant challenges that need to be addressed in order to develop truly provably beneficial AI systems.


S: Isn’t there a more fundamental problem with this big data approach? Specifically, that there are many cases where humans were forced to choose a course of action from a set of possibilities, all of which were bad options and not aligned with our values. For example, it is hard to argue that Truman’s decision to use nuclear weapons against Japan was his “preference,” but neither was a mass invasion of the Japanese home islands. In fact, all really hard ethical decisions are between bad alternatives, so using a big data approach to categorize the decision that was made as a “preference” seems to be very problematic, if we want AI to avoid making the same types of mistakes as humans.

GPT: You raise a valid concern about the limitations of the big data approach to defining human values and objectives. Indeed, there are situations where humans face difficult ethical decisions and must choose between suboptimal options, none of which align perfectly with their values. In these cases, inferring preferences from historical data may not accurately represent human values and could lead to AI systems making similar mistakes or perpetuating undesirable outcomes.

This fundamental problem highlights the importance of considering additional factors when designing provably beneficial AI:

  1. Context sensitivity: AI systems should be able to understand the broader context of human decision-making and recognize that certain decisions may have been made under duress or constrained by limited options, rather than as a true reflection of human preferences.
  2. Ethical reasoning: AI systems should be equipped with robust ethical reasoning capabilities that allow them to navigate complex moral dilemmas, even when historical data may not provide clear guidance on the best course of action.
  3. Human guidance and collaboration: Developing AI systems that work closely with humans and actively seek their input in situations with ambiguous or conflicting preferences can help ensure that the AI’s actions remain aligned with human values.
  4. Ongoing learning and adaptation: AI systems should be able to learn from new information, societal shifts, and evolving human values, allowing them to update their understanding of human preferences and make better decisions over time.

In summary, while the big data approach to inferring human values and objectives has its limitations, it can still be a useful starting point for understanding human preferences. However, to develop truly provably beneficial AI, it is crucial to combine this approach with other techniques that address the inherent complexity and nuance of human decision-making and values.


S: Human guidance and collaboration seems to imply that moral values are relative to who we ask. Is Russell’s approach inherently one of moral relativism?

GPT: Stuart Russell’s approach to AI value alignment does not explicitly endorse moral relativism, but it does recognize the diversity and complexity of human values, preferences, and cultural norms. By emphasizing the importance of human guidance and collaboration, Russell’s approach aims to create AI systems that respect and adapt to this diversity, rather than adhering to a single, predefined set of moral absolutes.

In practice, this means that AI systems designed using Russell’s approach may exhibit a degree of moral relativism, as they attempt to align with the values of the humans they interact with or the societies they operate within. However, the goal is not to embrace moral relativism for its own sake, but rather to ensure that AI systems are beneficial and aligned with human values, which can be diverse and even contradictory.

It’s important to note that balancing between moral relativism and moral absolutism is a challenge when designing AI systems. To create provably beneficial AI, researchers and developers must grapple with the complexities of human values and ethics, and strike a balance that respects diverse perspectives while still upholding some level of universally accepted moral principles.

S: How might we decide what are universally accepted moral principles? Who should help decide that? It seems like the greatest philosophical minds of all time have been working on this problem since before Plato, and we haven’t reached any agreement yet.

GPT: You are correct that determining universally accepted moral principles is a complex and longstanding challenge, one that philosophers have been grappling with for centuries. While there may never be a complete consensus on universally accepted moral principles, there are some approaches to consider when determining which principles to incorporate into AI systems:

  1. Cross-cultural consensus: Analyzing moral principles and values that are widely shared across different cultures and societies can help identify common ethical themes that may serve as a basis for universally accepted principles.
  2. Interdisciplinary collaboration: Involving experts from various fields, including philosophy, ethics, sociology, psychology, and computer science, can help develop a more comprehensive understanding of moral principles and their applicability to AI systems.
  3. Public involvement and deliberation: Engaging a diverse range of stakeholders, including the general public, in discussions and deliberations about moral principles and AI ethics can help ensure that multiple perspectives are considered and that the resulting principles are more widely accepted.
  4. Iterative refinement: Recognizing that universally accepted moral principles may evolve over time, it is important to establish ongoing processes for revising and updating these principles as societal values and norms change.
  5. International frameworks and guidelines: Building on existing international ethical frameworks, guidelines, and conventions related to human rights, environmental sustainability, and other global concerns can provide a foundation for developing universally accepted moral principles for AI systems.

Ultimately, determining universally accepted moral principles for AI is a collaborative and ongoing process that requires input from a diverse range of stakeholders, from philosophers and ethicists to policymakers and the general public. While achieving perfect consensus may be impossible, striving for a shared understanding of core moral principles can help guide the development of AI systems that are more broadly beneficial and aligned with human values.


Overall, not a bad summary of the goals and some of the potential pitfalls of Russell’s approach.

A Funny Thing Is Happening with ChatGPT

The latest version of ChatGPT (based on GPT-4) is about a week old, and already something funny is happening.

A Good Start

People are finding some pretty cool ways to use ChatGPT to help with their normal work.

One user gave ChatGPT a description of his company’s mission and a rough job description; GPT wrote a formal job description, and it was posted to a job board minutes later.

Another user had GPT interview him and write content based on the interview.

One friend had it draft a wealth management plan.

Another had it build the financial policy manual for his company, including lists of reports and deliverables, deadlines, and cost estimates.

People have figured out that they can get GPT to change its writing style by providing examples, and that they can even have it write legal documents.

Several people have had it draft marketing emails and web site copy to sell their products.

For the first few days, it seemed the sky was the limit with this amazing new tool.

And Some Frustration

In the last couple of days, I’ve increasingly been hearing about problems. Users are getting frustrated with the tool misbehaving and doing unexpected things.

In one case, for a user who was having it create budgets and financial plans, it started changing hourly rates and budget lines on its own. When confronted, it apologized profusely for the mistake.

Other users have been frustrated that it seems to change its mind if you ask it the same question more than once.

One guy was puzzled when he asked it to cut and paste some text into a new response, but it wrote a new paragraph on the same topic instead.

And of course, some users have tricked it into doing nonsensical things – like writing a set of instructions for patching a hole in the wall of a house made entirely of cheese.

What’s Happening?

What’s happening here? Is GPT misbehaving? Making mistakes? Deliberately undermining its users?

Without getting into the technical explanation for each of the types of “errors” above (and there are relatively simple explanations based on how GPT works for each of them), I think there is a meta-error happening. And we, the users, are the ones making it.

We have been so impressed with our initial experiences that we are treating GPT as something like an experienced Executive Assistant and expecting that level of performance from it. It is performing well enough that when it misses the mark, we are getting frustrated with it in the same way we would if it were human.

The problem is, GPT isn’t built to be an Executive Assistant. It is built to be a language model. What that means is that it does one thing very well – it predicts what word should come next given the previous couple thousand words, based on a corpus of about 500 billion words of written language (for more on how it does that, see my previous blog post here).

It’s worth repeating for emphasis: the only thing GPT does is generate text that looks convincing by modeling a statistical process to predict what word should come next.

What it doesn’t do is… everything else. It doesn’t build plans. It doesn’t apply logic. It doesn’t understand meaning. It doesn’t understand consequences. It doesn’t do rigorous research. It doesn’t formulate new ideas.

But GPT’s text generation model is good enough that it looks like it does those things. The training data includes many, many examples of those types of writing, so it produces output that is convincing and might even be reasonably accurate. But any accuracy or inaccuracy GPT provides is a side effect of what is in the training data. 

So, if you ask GPT for a financial plan, it generates text that looks like a financial plan. If you ask it to explain the logic of Zeno’s Paradox, it generates text that looks like an essay on that topic. If you ask it to explain what it means to be alive, it generates text that looks like an answer to that question. If you ask it to provide references on a certain topic, it generates text that looks like a list of references (some of them are decent references, and some of them are rubbish). 

What we are seeing is a disconnect between what we expect GPT to do and what it is actually capable of doing. This disconnect seems to be present in large part because GPT sounds so human that we expect it to behave like a human. GPT may be a victim of its own success. But it turns out that humans do a lot more than generate streams of words that sound human. We are expecting way too much from GPT. 

It is important to understand what GPT does, and what it doesn’t do. The reactions of some pretty smart people experimenting with the tool suggest that together, we (the users and OpenAI) haven’t done well at communicating the limitations of GPT. I hope we become more educated users soon – before we start treating GPT as if it were an authority on any important topic.

Perhaps GPT has passed the Turing Test, at least well enough that we are beginning to treat it as if it were an underperforming co-worker. It will be interesting to see what happens over the next few months.

What is ChatGPT and How Does It Work?

What is ChatGPT?

ChatGPT has been in the news a lot recently, and we’re going to hear a lot more about it in the near future. GPT stands for Generative Pre-trained Transformer, meaning that it uses a transformer network (a type of Deep Learning neural network) trained on a very large corpus of text to produce human-like text based on a prompt. GPT is a type of Large Language Model (LLM) and there are several other competing models in various states of development. One way to think of this is that GPT is a neural network implementation of a predictive text algorithm.

ChatGPT adds an interactive user interface to the front end of the GPT-3 model, allowing users to quickly and easily interact with the tool, and even allowing users to fine-tune how it responds based on previous interactions and examples of different writing styles. Give ChatGPT a prompt, and it will build an answer, an essay, or even a joke on a given topic by “analyzing” what has been written before in its training data and any data you have provided, and giving you a set of predictive text on the topic.

This post will break down what a predictive text algorithm is generally, what neural networks are and how they work, why ChatGPT is built using a neural network, and some background on ChatGPT’s development and how the underlying models were trained. The next posts will begin to look at some of the problems with LLMs, including GPT, and with ChatGPT specifically.

Predictive Text Algorithms

One way to conceptualize a predictive text algorithm is to imagine looking at every book that has ever been published (GPT also uses a much larger body of internet text, which presents additional problems that I’ll look at in a future post) and building a table of probabilities for every common English word (there are about 40,000 of them) that captures the odds of what the next word will be. We get a table of 40,000 x 40,000 entries, with each cell filled with a probability representing how often in the training text the second word follows the first. As you might imagine, many of the entries will be zeroes.

With this table in hand, we can “prompt” the table by giving it a keyword, and then hop through the table by picking the entry with the highest probability of being the next word. If we ask our model to generate text based on the prompt “cat” we might get something like “cat through shipping variety is made the aid emergency can the.” Hardly a useful sentence.
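To make the table idea concrete, here is a minimal Python sketch of a two-word (bigram) table and the "hop through the table" step. The three-sentence corpus and the function names are stand-ins I made up for illustration; it stores counts rather than probabilities, and dividing a count by its row total gives the probability described above.

```python
from collections import Counter, defaultdict

# Tiny stand-in corpus (an assumption for illustration; a real table would be
# built from a vast body of text).
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."


def build_bigram_table(text):
    """Map each word to a Counter of the words that directly follow it."""
    words = text.split()
    table = defaultdict(Counter)
    for first, second in zip(words, words[1:]):
        table[first][second] += 1
    return table


def next_word_probability(table, first, second):
    """Estimate how often `second` follows `first` in the training text."""
    total = sum(table[first].values())
    return table[first][second] / total if total else 0.0


def generate(table, prompt, length=6):
    """Hop through the table, always taking the most frequent next word."""
    output = [prompt]
    for _ in range(length):
        followers = table.get(output[-1])
        if not followers:  # word never seen, or never followed by anything
            break
        best_word, _count = followers.most_common(1)[0]
        output.append(best_word)
    return " ".join(output)


table = build_bigram_table(corpus)
print(next_word_probability(table, "the", "cat"))
print(generate(table, "sat", length=3))
```

Even on this toy corpus, always picking the single most likely next word tends to fall into a loop after a few hops, which is one reason a two-word table alone does not produce useful text.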

In order to get an output that makes more sense (seems more like something a human would write), we might extend our table to include the probabilities for the most likely next word given the previous two words. Our output in this case would be a little better, but now our table has 64 trillion entries (40,000^3), rather than just 1.6 billion (40,000^2). If we want to continue to improve the output of our algorithm, we need to keep extending the table to a greater depth of the-words-that-came-before. The problem with this approach is that it becomes too computationally expensive before it starts to produce good quality text. (For example, if we want to consider the previous 13 words, still not enough to produce a coherent essay-length output, we need a table with one entry for each atom of hydrogen in 100 stars the size of the sun.) This problem is complicated by the fact that we also don’t have enough written text to fill out much of the table at that depth.
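The growth in table size is easy to check with a few lines of Python. This is just a sketch of the arithmetic, assuming the 40,000-word vocabulary used above and one table entry for every combination of previous words plus a candidate next word; the function name is made up for illustration.

```python
VOCABULARY_SIZE = 40_000  # rough count of common English words, from the text


def table_entries(context_words):
    """Entries needed to score every possible next word given
    `context_words` previous words."""
    return VOCABULARY_SIZE ** (context_words + 1)


for context in (1, 2, 3, 13):
    print(f"{context:>2} previous word(s): {table_entries(context):.2e} entries")
```

Each extra word of context multiplies the table size by another 40,000, which is why this approach runs out of both computing power and training text so quickly.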

Because of these limitations, we need to find a better approach. We need an approach that is both more computationally efficient, and one that can achieve reasonable results using the training data we do have. GPT (and all Large Language Models, LLMs) uses a neural network to estimate the probabilities that we don’t have the computational capacity and necessary volume of training data to build traditionally.

Neural Networks

The idea of Neural Networks dates back to 1943, when Warren McCulloch and Walter Pitts formulated a simple model of networks of neurons and attempted to analyze the model mathematically. The first discussion of using neural networks as a possible approach to artificial general intelligence (AGI, or teaching machines to think “like” humans) was in the 1948 paper Intelligent Machinery by Alan Turing.  In 1949, Donald Hebb argued that such networks would be capable of learning. Computer simulations of neural networks began in the 1950s, but it soon became apparent that we didn’t have enough computing power to model enough neurons at a fast enough speed to do anything interesting. Neural Networks were put on the back burner of computing and AI for a few decades. They were studied rigorously again in the 1980s and 90s, but researchers decided that the only interesting things they could do were things that could be done more simply with probabilistic models. Neural networks finally broke out of the background between 2009 and 2012 when Swiss researchers successfully applied recurrent neural networks and deep feed-forward neural networks to win a series of international pattern recognition and machine learning competitions.

Neural networks work by simulating the type of computation that biological neurons perform. You can think of a simulated neuron as having a set of inputs (analogous to the dendrites that connect a biological neuron to other neurons); a function that converts those inputs to an output value (analogous to the body of the neuron); and a set of output connections that carry that output to some number of other neurons. We can think of each output connection as a signal with a particular strength, or weight. In a neural network, these units are usually organized into layers, which are frequently (but not always) two-dimensional arrays of interconnected neurons. A number of layers are connected in ways that generate the type of output we are seeking. There are many different types of connection topologies within layers, many different types of layer architectures, and many different ways of “stacking” the layers. ChatGPT, which is built on GPT-3, uses something like 10 million neurons with about 175 billion connections arranged in about 400 layers.
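To make the description concrete, here is a minimal sketch of one simulated neuron and one layer. It assumes a sigmoid as the function that converts inputs to an output, which is only one of many possible choices, and all of the names are made up for illustration. The weight on each connection is applied where the signal arrives at the receiving neuron.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias=0.0):
    """Weight each incoming signal, add them up, and convert to one output."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def layer(inputs, weight_rows, biases):
    """A layer is a group of neurons that all see the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Two input signals feeding a layer of three neurons.
outputs = layer(
    [1.0, 2.0],
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [0.0, 0.0, 0.0],
)
print(outputs)
```

Stacking layers just means feeding the list that one layer returns into the next layer as its inputs.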

Training the neural network is done by feeding it lots of data and giving it feedback in the form of a difference signal: the gap between its actual output and the desired output (measured at the final output layer, or sometimes at a layer close to the final layer). The network then adjusts the weights across the entire network to try to reduce that gap. As it turns out, it takes about as many training examples as there are connections in the network to “fully” train the network. In the case of GPT-3, that means about 175 billion words of training data.
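The feedback loop can be sketched with the smallest possible network: a single neuron with one weight and one bias, learning the made-up rule output = 2 × input + 1. This is an illustration of the idea only, under simplifying assumptions (no activation function, a hand-picked learning rate, hypothetical names); real networks like GPT-3 pass the difference signal back through many layers.

```python
def train(samples, learning_rate=0.05, epochs=1000):
    """Nudge one weight and one bias to shrink the output error."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = weight * x + bias      # what the neuron produces now
            error = target - output         # the difference signal
            weight += learning_rate * error * x  # adjust in proportion to the input
            bias += learning_rate * error
    return weight, bias


# Training data generated from the rule: target = 2 * x + 1
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias = train(samples)
print(weight, bias)  # should land close to 2 and 1
```

Every pass over the data moves the weight and bias a little closer to values that reproduce the examples, which is the same adjust-and-repeat loop described above at a vastly smaller scale.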

GPT-3 was trained on about 500 billion words of text. (Technically, tokens, which differ from words a little bit, but not enough to matter for our purposes.) The sources of that data are described in the table below (from Wikipedia):

Source: Wikipedia

Common Crawl is a dataset produced by the Common Crawl non-profit organization that contains over 3 billion web pages (as of Oct 2022). WebText2 is a corpus of text from web pages linked in Reddit submissions. Books1 and Books2 contain digitized published books. You may notice that the data sets are weighted within the model in a way that isn’t the same as their token count.

Note: for a more detailed account of exactly how ChatGPT works, and an excellent overview of the underlying technologies, see Stephen Wolfram’s post “What Is ChatGPT Doing … and Why Does It Work?” on Stephen Wolfram Writings.

One of the things that neural networks are good at is generalizing patterns from very large but incomplete data sets, more quickly and sometimes more usefully than a brute-force algorithm can. And that’s what ChatGPT does: it is a neural network that generalizes patterns from a large corpus of text and iteratively predicts what word should come next given the previous words (including the text it has already generated). By doing so, it has reached a level of capability that looks like useful human-written text in many ways, though it still has some significant limitations.
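The “iteratively predicts” part is a simple loop, sketched below. The tiny lookup table is a hypothetical stand-in for the trained network, and it only looks at the last word, whereas a real model considers all of the preceding text. This sketch also always picks the single most likely word, which is a simplification.

```python
# Hypothetical stand-in for the trained network: next-word probabilities.
TOY_MODEL = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 0.9, "the": 0.1},
}


def predict_next(words):
    """Return the most likely next word, or None if the model has no guess."""
    if not words:
        return None
    probabilities = TOY_MODEL.get(words[-1], {})
    if not probabilities:
        return None
    return max(probabilities, key=probabilities.get)


def generate(prompt, max_new_words):
    """Predict a word, append it, and feed the longer text back in."""
    words = prompt.split()
    for _ in range(max_new_words):
        next_word = predict_next(words)
        if next_word is None:
            break
        words.append(next_word)  # generated text becomes part of the input
    return " ".join(words)


print(generate("the", 5))
```

The important detail is the append step: each word the model produces becomes part of the text it uses to predict the word after that.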

In the next post, I’ll look at some of the problems with GPTs and LLMs generally, and some of the specific issues with OpenAI’s approach to ChatGPT (and the underlying GPT-3 and -4 models).

Update:

As I was editing this post, OpenAI released GPT-4 to the public. OpenAI has chosen not to reveal the specifics of the neural network implementation of GPT-4, but has promised that it will be “safer and more useful” than previous versions. Rumors, which have been denied by OpenAI’s founder, suggest that the model is on the order of 1000x bigger than GPT-3.